Keywords

1 Introduction

Background. Cryptographic hash functions are an important primitive in cryptography. They are classified into two classes: Unkeyed hash functions and keyed hash functions. The characteristic security requirement of an unkeyed hash function is collision resistance, which is the intractability of finding a pair of distinct inputs mapped to the same output. On the other hand, a keyed hash function is required to be a pseudorandom function (PRF) [10], which is indistinguishable from a uniform random function.

If a keyed hash function is a PRF satisfying collision resistance, then one can use it for computationally hiding and computationally binding string commitment. In addition, it has recently been shown that one can use it to achieve interesting cryptographic schemes such as compactly committing authenticated encryption with associated data (ccAEAD) [8, 11] and hash-based post-quantum EPID signatures [6]. In this paper, a keyed hash function satisfying collision resistance and the PRF property is called a collision-resistant and pseudorandom hash function.

HMAC [1] is a standardized keyed hash function in FIPS PUB 198-1 [9]. It is a collision-resistant and pseudorandom hash function. However, it is not so efficient for short message inputs.

Contribution. We present a method to construct a collision-resistant and pseudorandom hash function using a tweakable block cipher (TBC) in the TWEAKEY framework [19]. It is a kind of Merkle-Damgård iterated hash function [7, 20]. Its compression function adopts the double-block (DBL) construction [12] using a TBC to achieve sufficient level of collision resistance. Its domain extension extends KMDP\(^{+}\) [13] to achieve a PRF using the DBL compression function.

The proposed construction does not use the Merkle-Damgård strengthening for padding. Due to the feature, it achieves the minimum number of calls to its compression function for any message input under the assumption that the message input is fed only into the message-block input of its compression function.

The proposed construction is shown to be optimally collision-resistant in the ideal cipher model. It is also shown to be a secure PRF if the underlying TBC in the TWEAKEY framework is a secure tweakable pseudorandom permutation (PRP) in two tweakey strategies. In one tweakey strategy, the underlying TBC is required to be a secure tweakable PRP against related-key attacks. However, the related-key attacks are not so powerful in that the key-deriving functions are chosen by the designers.

Related Work. There have been proposals of keyed hash functions satisfying the PRF property and collision resistance: HMAC [1], EMD [4], Keyed-MDP [15], and KMDP\(^{+}\) [13]. All the constructions mentioned above except KMDP\(^{+}\) use the Merkle-Damgård strengthening for their padding. Thus, in terms of the number of calls to the underlying compression function, our proposed construction is more efficient than these constructions though they are competitive to ours.

The Merkle-Damgård hash function keyed via the initial value with prefix-free padding is shown to be a secure PRF if its compression function is a secure PRF [2]. Our proof on the PRF property is based on this proof.

Iwata and Kurosawa designed a CBC-MAC function called CMAC [21], which achieves the minimum number of calls to its block cipher [17, 18]. Though it is shown to be a secure PRF if its block cipher is a secure PRP, it is not aimed at collision resistance.

Organization. Notations and definitions are given in Sect. 2. The proposed construction is presented in Sect. 3. It is shown to satisfy collision resistance in the ideal cipher model in Sect. 4. It is shown to be a secure PRF if the underlying tweakable block cipher in the TWEAKEY framework is a secure tweakable PRP in two tweakey strategies in Sect. 5.

2 Preliminaries

Let \(\varSigma :=\{0,1\}\). Let \((\varSigma ^{n})^*:=\bigcup _{i\ge 0}\varSigma ^{ni}\) and \((\varSigma ^n)^{+}:=\bigcup _{i\ge 1}\varSigma ^{ni}\). Let \(\varepsilon \in \varSigma ^{0}\) be the empty sequence.

The length of a sequence \(x\in \varSigma ^{*}\) is denoted by |x|. The least significant bit of x is denoted by \(\textsf{lsb}(x)\). For sequences \(y_{i},y_{i+1},\ldots ,y_{i+j}\in \varSigma ^{*}\), their concatenation is denoted by \(y_{i}\Vert y_{i+1}\Vert \cdots \Vert y_{i+j}\) or \(y_{[i,i+j]}\).

For sequences \(x,y\in \varSigma ^{*}\), \(x\ \oplus \ y\) represents bit-wise XOR of x and y. If \(|x|>|y|\), then \(x\oplus y:=x \oplus (0^{|x|-|y|}\Vert y)\).

Let \(s\twoheadleftarrow \mathcal {S}\) represent that s is an element chosen uniformly at random from a set \(\mathcal {S}\).

For integers a, b, and d, let \(a\equiv _{d} b\) represent \(a\equiv b\pmod {d}\).

2.1 Cryptographic Hash Function

A cryptographic hash function is a function mapping an input of arbitrary length to an output of fixed length. It is often called simply a hash function. The characteristic security requirement of a hash function is collision resistance.

Let \(H^{P}\) be a hash function using a primitive P. In this paper, P is assumed to be an ideal TBC. Namely, P is chosen uniformly at random from the set of all TBCs with the same domain and range.

Let \(\textbf{A}\) be an adversary trying to find a colliding pair of inputs for \(H^{P}\), which are a pair of distinct inputs mapped to the same output. \(\textbf{A}\) can make encryption and decryption queries to its oracle P. The advantage of \(\textbf{A}\) against \(H^{P}\) for collision resistance is given by

$$ {{{\,\mathrm{\textrm{Adv}}\,}}}^{{{\,\mathrm{\textrm{col}}\,}}}_{H^{P}}(\textbf{A}):=\Pr [(X,X')\leftarrow \textbf{A}^{P}: H^{P}(X)=H^{P}(X')\wedge X\ne X']. $$

It is assumed that \(\textbf{A}\) makes all the queries to P necessary to compute both \(H^{P}(X)\) and \(H^{P}(X')\).

2.2 Pseudorandom Function

Let \(f:\mathcal {K}\times \mathcal {X}\rightarrow \mathcal {Y}\) be a keyed function with its key space \(\mathcal {K}\). A security requirement of f is indistinguishability from a uniform random function. The goal of an adversary \(\textbf{A}\) against f is to distinguish between \(f_K(\cdot ):=f(K,\cdot )\) and a random oracle \(\rho :\mathcal {X}\rightarrow \mathcal {Y}\), where \(K\twoheadleftarrow \mathcal {K}\). \(\textbf{A}\) has either \(f_K\) or \(\rho \) as an oracle and outputs 0 or 1. The advantage of \(\textbf{A}\) against f as a PRF is defined as

$$ {{{\,\mathrm{\textrm{Adv}}\,}}}^{{{\,\mathrm{\textrm{prf}}\,}}}_{f}(\textbf{A}):=\left| \Pr \left[ \textbf{A}^{f_K}=1\right] -\Pr \left[ \textbf{A}^{\rho }=1\right] \right| . $$

f is called a secure PRF if no efficient adversary \(\textbf{A}\) has any significant advantage. The advantage can be extended to adversaries with multiple oracles:

figure a

where \((K_{1},\ldots ,K_{p})\twoheadleftarrow \mathcal {K}^{p}\) and \(\rho _{1},\ldots ,\rho _{p}\) are independent random oracles.

2.3 Tweakable Block Cipher in TWEAKEY Framework

A TBC in the TWEAKEY framework is a function \(\tilde{e}:\varSigma ^{\nu }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) with its tweakey space \(\varSigma ^{\nu }\) such that, for every \(Y\in \varSigma ^{\nu }\), \(\tilde{e}(Y,\cdot )\) is a permutation. We assume that \(\tilde{e}:\varSigma ^{\nu }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) is a family of TBCs with their key space and tweak space \(\varSigma ^{\kappa }\) and \(\varSigma ^{\tau }\), respectively, satisfying \(\varSigma ^{\nu }=\varSigma ^{\kappa }\times \varSigma ^{\tau }\).

Let \(\mathcal {P}_{\tau ,n}\) be the set of all tweakable permutations over \(\varSigma ^{n}\) with their tweak space \(\varSigma ^{\tau }\). Namely, for every \(\varpi \in \mathcal {P}_{\tau ,n}\) and every \(T\in \varSigma ^{\tau }\), \(\varpi (T,\cdot )\) is a permutation over \(\varSigma ^{n}\).

A security requirement of a TBC \(e:\varSigma ^{\kappa }\times \varSigma ^{\tau }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) is indistinguishability from a tweakable uniform random permutation. The advantage of an adversary \(\textbf{A}\) against e as a tweakable PRP (TPRP) is defined as

$$ {{{\,\mathrm{\textrm{Adv}}\,}}}_{e}^{{{\,\mathrm{\textrm{tprp}}\,}}}(\textbf{A}):=\bigl |\Pr [\textbf{A}^{e_K}=1]-\Pr [\textbf{A}^{\varpi }=1]\bigr |, $$

where \(K\twoheadleftarrow \varSigma ^{\kappa }\) and \(\varpi \twoheadleftarrow \mathcal {P}_{\tau ,n}\). \(\textbf{A}\) is allowed to make queries in \(\varSigma ^{\tau }\times \varSigma ^{n}\) adaptively to its oracle \(e_{K}\) or \(\varpi \) and outputs 0 or 1.

The following lemma is a kind of PRP/PRF switching lemma [5, 16] for a TBC.

Lemma 1

For any adversary \(\textbf{A}\) against a TBC \(e:\varSigma ^{\kappa }\times \varSigma ^{\tau }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) taking at most t time and making at most q queries to its oracle, there exists an adversary \(\textbf{X}\) against e such that

$$\begin{aligned} {{\,\mathrm{\textrm{Adv}}\,}}_{e}^{{{\,\mathrm{\textrm{prf}}\,}}}(\textbf{A})\le {{\,\mathrm{\textrm{Adv}}\,}}_{e}^{{{\,\mathrm{\textrm{tprp}}\,}}}(\textbf{X})+q^{2}/{2^{n+1}} \end{aligned}$$

and \(\textbf{X}\) takes at most t time and makes at most q queries.

2.4 PRF and TPRP Under Related-Key Attack

Let \(f:\mathcal {K}\times \mathcal {X}\rightarrow \mathcal {Y}\). Let \({\varPhi }\) be a set of functions from \(\mathcal {K}\) to \(\mathcal {K}\). Let \(\textbf{A}\) be an adversary against f making a related-key attack restricted to \(\varPhi \) (\(\varPhi \)-RKA) [3]: \(\textbf{A}\) is given \(g[K]:{\varPhi }\times \mathcal {X}\rightarrow \mathcal {Y}\) such that \(g[K](\varphi ,X):=g(\varphi (K),X)\) as an oracle, where g is either f or a random oracle \(\rho :\mathcal {K}\times \mathcal {X}\rightarrow \mathcal {Y}\) and \(K\twoheadleftarrow \mathcal {K}\). \(\varphi \) is called a key-deriving function. The advantage of \(\textbf{A}\) against f as a PRF under a \(\varPhi \)-RKA is defined as

figure b

f is called a secure PRF under \(\varPhi \)-RKAs if no efficient adversary \(\textbf{A}\) has any significant advantage. The advantage of \(\textbf{A}\) with p oracles is defined as

figure c

where \((K_{1},\ldots ,K_{p})\twoheadleftarrow \mathcal {K}^{p}\) and \(\rho _{1},\ldots ,\rho _{p}\) are independent random oracles.

For a TBC e, and are defined similarly.

The following lemma is a kind of PRP/PRF switching lemma against adversaries making related-key attacks [14] for a TBC e.

Lemma 2

Let \(\textbf{A}\) be any adversary with p oracles against e taking at most t time and making at most \(q_{i}\) queries to its i-th oracle for \(1\le i\le p\). Let \(q:=q_{1}+\cdots +q_{p}\). Then, there exists an adversary \(\textbf{X}\) against e such that

figure f

and \(\textbf{X}\) takes at most \(t+O(q\,T_{e})\) time and makes at most \(\max \{q_{1},q_{2},\ldots ,q_{p}\}\) queries, where \(T_{e}\) represents the time required to compute e.

3 Proposed Construction

Let \(E:\varSigma ^{\nu }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) be a TBC in the TWEAKEY framework such that \(\nu \ge 2n\). The proposed construction \(\textsf{C}^{E}:\varSigma ^{n}\times \varSigma ^{*}\rightarrow \varSigma ^{2n}\) is described in Algorithm 1. It is also depicted in Fig. 1. For the PRF property, \(\textsf{C}^{E}\) is viewed as a keyed function with its key space \(\varSigma ^{n}\). \(\textsf{C}^{E}\) incorporates constants \(IV\in \varSigma ^{n}\) and \(c_{00},c_{01},c_{10},c_{11},\delta \in \varSigma ^{n}\setminus \{0^{n}\}\). \(\delta \) is a constant such that \(\textsf{lsb}(\delta )=1\). \(c_{00},c_{01},c_{10},c_{11}\) are distinct from each other. \(\textsf{cf}^{E}:\varSigma ^{2n}\times \varSigma ^{\nu -n}\rightarrow \varSigma ^{2n}\) is a compression function such that

$$\begin{aligned} \textsf{cf}^{E}(V_{i-1},M_{i}):=E(V_{i-1}\Vert M_{i,0},M_{i,1})\Vert E(V_{i-1}\Vert M_{i,0},M_{i,1}\oplus \delta ), \end{aligned}$$

where \(M_{i}=M_{i,0}\Vert M_{i,1}\) and \(|M_{i,1}|=n\). If \(\nu =2n\), then \(M_{i,0}=\varepsilon \). \(\texttt{pad}:\varSigma ^{*}\rightarrow (\varSigma ^{\nu -n})^{+}\) is a padding function such that

$$ \texttt{pad}(M):= {\left\{ \begin{array}{ll} M &{} \text {if }|M|>0 \text { and } |M|\equiv _{\nu -n} 0\\ M\Vert 10^{a} &{} \text {otherwise}, \end{array}\right. } $$

where a is the non-negative integer such that \(|\texttt{pad}(M)|\) is the smallest multiple of \(\nu -n\). Notice that \(\texttt{pad}(\varepsilon )=10^{\nu -n-1}\).

Remark 1

Let \(\textsf{sw}\) be a permutation over \(\varSigma ^{n}\times \varSigma ^{n}\) such that \((x_{0},x_{1})\mapsto (x_{1},x_{0})\). Then, for any \((V_{i-1},M_{i})\in \varSigma ^{2n}\times \varSigma ^{\nu -n}\), \(\textsf{cf}^{E}(V_{i-1},M_{i}\oplus \delta )=\textsf{sw}(\textsf{cf}^{E}(V_{i-1},M_{i}))\). For the PRF property, \(\textsf{C}^{E}\) is designed so that the reflectiveness of \(\textsf{cf}^{E}\) does not appear at the last call to \(\textsf{cf}^{E}\). Notice that \(\textsf{lsb}(M_{m})\not =\textsf{lsb}(M_{m}\oplus \delta )\).

figure g
Fig. 1.
figure 1

The proposed construction

4 Collision Resistance

Any adversary needs \({\varOmega }(2^{n})\) queries to find a colliding pair of inputs for \(\textsf{C}^{E}\) under the assumption that \(E\) is an ideal cipher:

Theorem 1

For any adversary \(\textbf{A}\) making at most q queries,

$$\begin{aligned} {{\,\mathrm{\textrm{Adv}}\,}}_{\textsf{C}^{E}}^{{{\,\mathrm{\textrm{col}}\,}}}(\textbf{A})\le 7q/(2^{n}-2q)+5q(q-1)/(2^{n}-2q)^{2}. \end{aligned}$$

Proof

Suppose that \(\textbf{A}\) finds a colliding pair of inputs, (KM) and \((K',M')\) for \(\textsf{C}^{E}\). Then, \(\textsf{C}^{E}(K,M)=\textsf{C}^{E}(K',M')\) and \((K,M)\ne (K',M')\). Without loss of generality, suppose that \(|M|\le |M'|\). Let \(\texttt{pad}(M)=M_{1}\Vert M_{2}\Vert \cdots \Vert M_{m}\) and \(\texttt{pad}(M')=M_{1}'\Vert M_{2}'\Vert \cdots \Vert M_{m'}'\).

  1. (i)

    Suppose that \(m=m'=1\). If \(K\ne K'\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\). Otherwise, \(M\ne M'\).

    • If \(|M|=|M'|=\nu - n\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) since \(\texttt{pad}(M)\ne \texttt{pad}(M')\).

    • If \(|M|<\nu -n\) and \(|M'|<\nu -n\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) since \(\texttt{pad}(M)\ne \texttt{pad}(M')\).

    • If \(|M|<\nu -n\) and \(|M'|=\nu -n\), then

      $$\begin{aligned} \textsf{cf}^{E}(K\Vert ( IV \oplus c),M_{1})=\textsf{cf}^{E}(K\Vert ( IV \oplus c'),M_{1}'), \end{aligned}$$

      where \(c\in \{c_{10},c_{11}\}\) and \(c'\in \{c_{00},c_{01}\}\). Thus, \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) since \(\{c_{10},c_{11}\}\cap \{c_{00},c_{01}\}=\emptyset \).

  2. (ii)

    Suppose that \(m=1\) and \(m'\ge 2\). Then,

    $$\begin{aligned} \textsf{cf}^{E}(K\Vert ( IV \oplus c),M_{1})=\textsf{cf}^{E}(V_{m'-1}'\oplus c',M_{m'}'), \end{aligned}$$

    where

    $$\begin{aligned} c\in {\left\{ \begin{array}{ll} \{c_{00},c_{01}\} &{} \text {if }|M|=\nu -n\text {,}\\ \{c_{10},c_{11}\} &{} \text {if }|M|<\nu -n\text {,} \end{array}\right. }\qquad c'\in {\left\{ \begin{array}{ll} \{c_{00},c_{01}\} &{} \text {if }|M'|\equiv _{\nu -n} 0\text {,}\\ \{c_{10},c_{11}\} &{} \text {if }|M'|\not \equiv _{\nu -n} 0\text {.} \end{array}\right. } \end{aligned}$$

    If \((K\Vert ( IV \oplus c),M_{1})\ne (V_{m'-1}'\oplus c',M_{m'}')\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\). Otherwise, \(M_{1}=M_{m'}'\) and the least significant \(n\) bits of \(V_{m'-1}'\) equals \( IV \oplus c\oplus c'\). Thus, \(\textbf{A}\) finds an input for \(\textsf{cf}^{E}\) such that the least significant \(n\) bits of the corresponding output equals

    • \( IV \) if \(|M|=\nu -n\) and \(|M'|\equiv _{\nu -n} 0\),

    • \( IV \) if \(|M|<\nu -n\) and \(|M'|\not \equiv _{\nu -n} 0\), and

    • \( IV \oplus c_{00}\oplus c_{10}\) or \( IV \oplus c_{01}\oplus c_{11}\) otherwise.

  3. (iii)

    Suppose that \(m\ge 2\) and \(m'\ge 2\). Then,

    $$\begin{aligned} \textsf{cf}^{E}(V_{m-1}\oplus c,M_{m})=\textsf{cf}^{E}(V_{m'-1}'\oplus c',M_{m'}') \end{aligned}$$

    where

    $$\begin{aligned} c\in {\left\{ \begin{array}{ll} \{c_{00},c_{01}\} &{} \text {if }|M|\equiv _{\nu -n} 0\text {,}\\ \{c_{10},c_{11}\} &{} \text {if }|M|\not \equiv _{\nu -n} 0\text {,} \end{array}\right. }\qquad c'\in {\left\{ \begin{array}{ll} \{c_{00},c_{01}\} &{} \text {if }|M'|\equiv _{\nu -n} 0\text {,}\\ \{c_{10},c_{11}\} &{} \text {if }|M'|\not \equiv _{\nu -n} 0\text {.} \end{array}\right. } \end{aligned}$$

If \((V_{m-1}\oplus c,M_{m})\ne (V_{m'-1}'\oplus c',M_{m'}')\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\). Otherwise, \(V_{m-1}\oplus V_{m'-1}'=0^{n}\Vert (c\oplus c')\) and \(M_{m}=M_{m'}'\).

  • If \(|M|\equiv _{\nu -n} 0\) and \(|M'|\not \equiv _{\nu -n} 0\), or \(|M|\not \equiv _{\nu -n} 0\) and \(|M'|\equiv _{\nu -n} 0\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) wrt \(0^{n}\Vert (c_{00}\oplus c_{10})\) or \(0^{n}\Vert (c_{01}\oplus c_{11})\), that is, a pair of inputs \((V_{m-2},M_{m-1})\) and \((V_{m'-2}',M_{m'-1}')\) such that \(\textsf{cf}^{E}(V_{m-2},M_{m-1})\oplus \textsf{cf}^{E}(V_{m'-2}',M_{m'-1}')\) equals \(0^{n}\Vert (c_{00}\oplus c_{10})\) or \(0^{n}\Vert (c_{01}\oplus c_{11})\).

  • Suppose that \(|M|\equiv _{\nu -n} 0\) and \(|M'|\equiv _{\nu -n} 0\). Then, since \(M_{m}=M_{m'}'\), \(c=c'\) and \(V_{m-1}=V_{m'-1}'\). If \(m=m'\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) since \((K,M)\not =(K',M')\). If \(m<m'\), then \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) or an input for \(\textsf{cf}^{E}\) such that the least significant \(n\) bits of the corresponding output equals \( IV \).

  • Suppose that \(|M|\not \equiv _{\nu -n} 0\) and \(|M'|\not \equiv _{\nu -n} 0\). This case is similar to the case above. Thus, \(\textbf{A}\) finds a colliding pair for \(\textsf{cf}^{E}\) or an input for \(\textsf{cf}^{E}\) such that the least significant \(n\) bits of the corresponding output equals \( IV \).

Thus, a colliding pair for \(\textsf{C}^{E}\) implies for \(\textsf{cf}^{E}\)

  1. 1.

    a colliding pair,

  2. 2.

    a colliding pair wrt \(0^{n}\Vert (c_{00}\oplus c_{10})\) or \(0^{n}\Vert (c_{01}\oplus c_{11})\), or

  3. 3.

    an input mapped to an output whose least significant \(n\) bits equals \( IV \), \( IV \oplus c_{00}\oplus c_{10}\) or \( IV \oplus c_{01}\oplus c_{11}\).

Let us consider an adversary \(\tilde{\textbf{A}}\) running \(\textbf{A}\). For each query \(\textbf{A}\), \(\tilde{\textbf{A}}\) makes at most 2 queries. Thus, \(\tilde{\textbf{A}}\) makes at most 2q queries in total.

For a query of \(\textbf{A}\), if \(\tilde{\textbf{A}}\) already knows the corresponding answer, then \(\tilde{\textbf{A}}\) simply returns it to \(\textbf{A}\). Suppose that \(\tilde{\textbf{A}}\) does not know the answer. If the query of \(\textbf{A}\) is an encryption query \(( TK , PT )\), then \(\tilde{\textbf{A}}\) asks \(( TK , PT )\) and \(( TK , PT \oplus \delta )\) to \(E\) and receives replies \( CT \) and \( CT '\), respectively. Then, \(\tilde{\textbf{A}}\) returns \( CT \) to \(\textbf{A}\). If the query of \(\textbf{A}\) is a decryption query \(( TK , CT )\), then \(\tilde{\textbf{A}}\) asks \(( TK , CT )\) to \(E^{-1}\), receives a reply \( PT \) and returns it to \(\textbf{A}\). Then, \(\tilde{\textbf{A}}\) asks \(( TK , PT \oplus \delta )\) to \(E\) and receives a reply \( CT '\). In both of the cases, \(\tilde{\textbf{A}}\) gets \(( TK , PT , CT )\) and \(( TK , PT \oplus \delta , CT ')\). Notice that \(\textsf{cf}^{E}( TK_{\textrm{v}} , TK_{\textrm{m}} \Vert PT )=( CT \oplus PT )\Vert ( CT '\oplus PT \oplus \delta )\) and \(\textsf{cf}^{E}( TK_{\textrm{v}} , TK_{\textrm{m}} \Vert ( PT \oplus \delta ))=( CT '\oplus PT \oplus \delta )\Vert ( CT \oplus PT )\), where \( TK = TK_{\textrm{v}} \Vert TK_{\textrm{m}} \) and \(| TK_{\textrm{v}} |=2n\). Also notice that, if \( CT \oplus CT '=\delta \), then \(\textsf{cf}^{E}( TK_{\textrm{v}} , TK_{\textrm{m}} \Vert PT )=\textsf{cf}^{E}( TK_{\textrm{v}} , TK_{\textrm{m}} \Vert ( PT \oplus \delta ))\).

Let \(( TK _{j}, PT _{j}, CT _{j})\) and \(( TK _{j}, PT _{j}\oplus \delta , CT '_{j})\) be the tuples obtained by \(\tilde{\textbf{A}}\) for the j-th query of \(\textbf{A}\). Let \(U_{j}:= CT _{j}\oplus PT _{j}\) and \(U'_{j}:= CT '_{j}\oplus PT _{j}\oplus \delta \). Then, for an execution of \(\tilde{\textbf{A}}\), the j-th query of \(\textbf{A}\) induces 1 or 2 above for \(\textsf{cf}^{E}\) if \(U_{j}=U'_{j}\) or there exists some \(j'<j\) such that

  • \(U_{j}\Vert U'_{j}\in \{U_{j'}\Vert U'_{j'},U_{j'}\Vert (U'_{j'}\oplus c_{00}\oplus c_{10}),U_{j'}\Vert (U'_{j'}\oplus c_{01}\oplus c_{11})\}\),

  • \(U_{j}\Vert U'_{j}\in \{U'_{j'}\Vert U_{j'},U'_{j'}\Vert (U_{j'}\oplus c_{00}\oplus c_{10}),U'_{j'}\Vert (U_{j'}\oplus c_{01}\oplus c_{11})\}\),

  • \(U'_{j}\Vert U_{j}\in \{U_{j'}\Vert U'_{j'},U_{j'}\Vert (U'_{j'}\oplus c_{00}\oplus c_{10}),U_{j'}\Vert (U'_{j'}\oplus c_{01}\oplus c_{11})\}\), or

  • \(U'_{j}\Vert U_{j}\in \{U'_{j'}\Vert U_{j'},U'_{j'}\Vert (U_{j'}\oplus c_{00}\oplus c_{10}),U'_{j'}\Vert (U_{j'}\oplus c_{01}\oplus c_{11})\}\).

Thus, the probability that the j-th query of \(\textbf{A}\) induces 1 or 2 above for \(\textsf{cf}^{E}\) is at most \(10(j-1)/(2^{n}-2q)^{2}+1/(2^{n}-2q)\). The probability that it induces 3 above for \(\textsf{cf}^{E}\) is at most \(6/(2^{n}-2q)\). Since \(\textbf{A}\) makes at most q queries,

$$\begin{aligned} \sum _{j=1}^{q}(10(j-1)/(2^{n}-2q)^{2}+7/(2^{n}-2q))\le 7q/(2^{n}-2q)+5q(q-1)/(2^{n}-2q)^{2}. \end{aligned}$$

It is also an upper bound on \({{\,\mathrm{\textrm{Adv}}\,}}_{\textsf{C}^{E}}^{{{\,\mathrm{\textrm{col}}\,}}}(\textbf{A})\).    \(\square \)

5 Pseudorandom-Function Property

The proposed construction \(\textsf{C}^{E}\) treats the TBC \(E:\varSigma ^{\nu }\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) in the TWEAKEY framework in two tweakey strategies: \(\varSigma ^{\nu }:=\varSigma ^{n}\times \varSigma ^{\nu -n}\) in one strategy and \(\varSigma ^{\nu }:=\varSigma ^{2n}\times \varSigma ^{\nu -2n}\) in the other strategy. We denote \(E\) in the former and the latter tweakey strategies by \(\dot{E}\) and \(\ddot{E}\), respectively.

For \(\ddot{E}\), we consider related-key attacks with related-key-deriving functions \({\varPhi }:=\{\textsf{id},\textsf{sw},\textsf{x}_{c_{00}},\textsf{x}_{c_{01}},\textsf{x}_{c_{10}},\textsf{x}_{c_{11}},\textsf{sw}\circ \textsf{x}_{c_{00}},\textsf{sw}\circ \textsf{x}_{c_{01}},\textsf{sw}\circ \textsf{x}_{c_{10}},\textsf{sw}\circ \textsf{x}_{c_{11}}\}\), where

  • \(\textsf{id}\) is the identity permutation over \(\varSigma ^{2n}\),

  • \(\textsf{sw}\) is a permutation over \(\varSigma ^{n}\times \varSigma ^{n}\) such that \((x_{0},x_{1})\mapsto (x_{1},x_{0})\), and

  • for \(c\in \varSigma ^{n}\), \(\textsf{x}_{c}\) is a permutation over \(\varSigma ^{2n}\) such that \(x\mapsto x\oplus c\).

\(\textsf{C}^{E}\) is a secure PRF if \(\dot{E}\) is a secure TPRP and \(\ddot{E}\) is a secure TPRP under \(\varPhi \)-related-key attacks:

Theorem 2

For any adversary \(\textbf{A}\) against \(\textsf{C}^{E}\) taking at most t time and making at most q queries each of which has at most \(\ell \) blocks after padding, there exist adversaries \(\dot{\textbf{A}}\) against \(\dot{E}\) and \(\ddot{\textbf{A}}\) against \(\ddot{E}\) such that

figure h

Both \(\dot{\textbf{A}}\) and \(\ddot{\textbf{A}}\) take at most about \(t+O(\ell q\,T_{E})\) time and make at most 2q queries, where \(T_{E}\) is time required to compute \(E\).

Proof

Let \(\textsf{I}^{c}:\varSigma ^{2n}\times (\varSigma ^{\nu -n})^{+}\rightarrow \varSigma ^{2n}\) be a keyed function specified in Algorithm 2. For an integer \(k\ge 0\) and functions \(\zeta :\varSigma ^{*}\rightarrow \varSigma ^{2n}\) and \(\eta :(\varSigma ^{\nu -n})^{*}\rightarrow \varSigma ^{2n}\), let \(\textsf{Hy}[k]^{\zeta ,\eta }:\varSigma ^{*}\rightarrow \varSigma ^{2n}\) be a function specified as follows: For \(M\in \varSigma ^{*}\) such that \(\texttt{pad}(M)=M_{1}\Vert M_{2}\Vert \cdots \Vert M_{m}\) and \(|M_{i}|=\nu -n\) for \(1\le i\le m\),

$$\begin{aligned} \textsf{Hy}[k]^{\zeta ,\eta }(M):= {\left\{ \begin{array}{ll} \zeta (M) &{} \text {if }m\le k\text {,} \\ \textsf{I}^{c}(\eta (M_{[1,k]}),M_{[k+1,m]}) &{} \text {if }m>k\text {,}\\ \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} c\leftarrow {\left\{ \begin{array}{ll} c_{00} &{} \text {if }|M|>0\wedge |M|\equiv _{\nu -n} 0\wedge \textsf{lsb}(M_{m})=0\text {,}\\ c_{01} &{} \text {if }|M|>0\wedge |M|\equiv _{\nu -n} 0\wedge \textsf{lsb}(M_{m})=1\text {,}\\ c_{10} &{} \text {if }(|M|=0\vee |M|\not \equiv _{\nu -n} 0)\wedge \textsf{lsb}(M_{m})=0\text {,}\\ c_{11} &{} \text {if }(|M|=0\vee |M|\not \equiv _{\nu -n} 0)\wedge \textsf{lsb}(M_{m})=1\text {.} \end{array}\right. } \end{aligned}$$
(1)

Notice that \(M_{[1,0]}=\varepsilon \).

Suppose that \(\zeta \) is a random oracle and \(\eta \) is a random function such that

  • \(\eta (\varepsilon )\twoheadleftarrow \varSigma ^{n}\times \{ IV \}\), and

  • for every \(k\ge 1\) and \(M_{[1,k]}\) such that \(\textsf{lsb}(M_{k})=0\), \(\eta (M_{[1,k]})\twoheadleftarrow \varSigma ^{2n}\) and \(\eta (M_{[1,k]}\oplus \delta )\leftarrow \textsf{sw}(\eta (M_{[1,k]}))\).

Then,

$$\begin{aligned} \textsf{Hy}[0]^{\zeta ,\eta }(M)=\textsf{I}^{c}(\eta (\varepsilon ),M_{[1,m]}) \end{aligned}$$

and c is chosen as specified by Formula (1). Thus, \(\textsf{Hy}[0]^{\zeta ,\eta }\) is equivalent to \(\textsf{C}^{E}\). \(\textsf{Hy}[\ell ]^{\zeta ,\eta }\) works as a random oracle for any \(M\in \varSigma ^{*}\) such that \(\texttt{pad}(M)\) consists of at most \(\ell \) blocks. Since every query made by \(\textbf{A}\) is assumed to consist of at most \(\ell \) blocks after padding,

$$\begin{aligned} {{\,\mathrm{\textrm{Adv}}\,}}_{\textsf{C}^{E}}^{{{\,\mathrm{\textrm{prf}}\,}}}(\textbf{A})=\bigl |\Pr [\textbf{A}^{\textsf{Hy}[0]^{\zeta ,\eta }}=1]-\Pr [\textbf{A}^{\textsf{Hy}[\ell ]^{\zeta ,\eta }}=1]\bigr |. \end{aligned}$$

Let \(\varDelta _{k}:=\bigl |\Pr [\textbf{A}^{\textsf{Hy}[k]^{\zeta ,\eta }}=1]-\Pr [\textbf{A}^{\textsf{Hy}[k+1]^{\zeta ,\eta }}=1]\bigr |\). Then,

$$\begin{aligned} {{\,\mathrm{\textrm{Adv}}\,}}_{\textsf{C}^{E}}^{{{\,\mathrm{\textrm{prf}}\,}}}(\textbf{A})\le \varDelta _{0}+\varDelta _{1}+\cdots +\varDelta _{\ell -1}. \end{aligned}$$
(2)

For \(\varDelta _{0}\), let \(\textbf{D}_{0}\) be an adversary against \(\dot{E}\). \(\textbf{D}_{0}\) runs \(\textbf{A}\) and produces the same output as \(\textbf{A}\). It also simulates the oracle of \(\textbf{A}\) using its oracle. Let \(\dot{F}:\varSigma ^{\nu -n}\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) be the oracle of \(\textbf{D}_{0}\), which is either \(\dot{E}_{K}\) or \(\dot{\rho }\), where \(K\twoheadleftarrow \varSigma ^{n}\) and \(\dot{\rho }\) is a random oracle. For each query M of \(\textbf{A}\) such that \(\texttt{pad}(M)=M_{1}\Vert \cdots \Vert M_{m}\), \(\textbf{D}_{0}\) acts as follows: For \(1\le i\le m\), \(M_{i}:=M_{i,0}\Vert M_{i,1}\), where \(|M_{i,0}|=\nu -2n\) and \(|M_{i,1}|=n\).

  • If \(m=1\), then \(\textbf{D}_{0}\) asks \((( IV \oplus c)\Vert M_{1,0},M_{1,1})\) and \((( IV \oplus c)\Vert M_{1,0},M_{1,1}\oplus \delta )\) to \(\dot{F}\) and returns \(\textsf{cf}^{\dot{F}}(K\Vert ( IV \oplus c),M_{1})\) to \(\textbf{A}\).

  • If \(m\ge 2\), then \(\textbf{D}_{0}\) asks \(( IV \Vert M_{1,0},M_{1,1})\) and \(( IV \Vert M_{1,0},M_{1,1}\oplus \delta )\) to \(\dot{F}\) and returns \(\textsf{I}^{c}(\textsf{cf}^{\dot{F}}(K\Vert IV ,M_{1}),M_{[2,m]})\) to \(\textbf{A}\).

In both of the cases above, c is chosen as specified by Formula (1). \(\textbf{D}_{0}\) implements \(\textsf{Hy}[0]^{\zeta ,\eta }\) as the oracle of \(\textbf{A}\) if its oracle is \(\dot{E}_{K}\). It implements \(\textsf{Hy}[1]^{\zeta ,\eta }\) if its oracle is \(\dot{\rho }\) since \(\dot{\rho }(( IV \oplus c_{00})\Vert M_{1,0},M_{1,1})\), \(\dot{\rho }(( IV \oplus c_{01})\Vert M_{1,0},M_{1,1})\), \(\dot{\rho }(( IV \oplus c_{10})\Vert M_{1,0},M_{1,1})\), \(\dot{\rho }(( IV \oplus c_{11})\Vert M_{1,0},M_{1,1})\) and \(\dot{\rho }( IV \Vert M_{1,0},M_{1,1})\) are independent from each other. Thus,

$$\begin{aligned} \varDelta _{0}=\bigl |\Pr [\textbf{D}_{0}^{\dot{E}_{K}}=1]-\Pr [\textbf{D}_{0}^{\dot{\rho }}=1]\bigr | ={{\,\mathrm{\textrm{Adv}}\,}}_{\dot{E}}^{{{\,\mathrm{\textrm{prf}}\,}}}(\textbf{D}_{0}). \end{aligned}$$
(3)

\(\textbf{D}_{0}\) takes at most about \(t+O(\ell q\,T_{E})\) time and makes at most 2q queries.

Suppose that \(1\le k\le \ell -1\). For \(\varDelta _{k}\), let \(\textbf{D}_{k}\) be an adversary making a \(\varPhi \)-related-key attack on \(\ddot{E}\). \(\textbf{D}_{k}\) runs \(\textbf{A}\) and produces the same output as \(\textbf{A}\). It also simulates the oracle of \(\textbf{A}\) using its oracle. \(\textbf{D}_{k}\) has q oracles \(\ddot{F}_{i}[K_{i}]:\varSigma ^{\nu -2n}\times \varSigma ^{n}\rightarrow \varSigma ^{n}\) for \(1\le i\le q\). They are either \(\ddot{E}[K_{1}],\ldots ,\ddot{E}[K_{q}]\) or \(\ddot{\rho }_{1}[K_{1}],\ldots ,\ddot{\rho }_{q}[K_{q}]\), where \(K_{i}\twoheadleftarrow \varSigma ^{2n}\) and \(\ddot{\rho }_{i}\) is a random oracle for \(1\le i\le q\). For the j-th query M of \(\textbf{A}\), let \(\texttt{pad}(M)=M_{1}\Vert \cdots \Vert M_{m}\), where \(M_{i}:=M_{i,0}\Vert M_{i,1}\), \(|M_{i,0}|=\nu -2n\) and \(|M_{i,1}|=n\) for \(1\le i\le m\). Suppose that \(m\le k\). Then, \(\textbf{D}_{k}\) simulates \(\zeta \) and returns \(\zeta (M)\) to \(\textbf{A}\). Suppose that \(m>k\). Let \(\mathcal {J}\) be a set of integers \(j'\) such that \(j'<j\) and the \(j'\)-th query \(M'\) of \(\textbf{A}\) satisfies \(m'>k\) and \(M_{[1,k]}'=M_{[1,k]}\vee M_{[1,k]}'=M_{[1,k-1]}\Vert M_{k,0}\Vert (M_{k,1}\oplus \delta )\), where \(\texttt{pad}(M')=M_{1}'\Vert \cdots \Vert M_{m'}'\). Let \(j^{*}\leftarrow j\) if \(\mathcal {J}=\emptyset \) and \(j^{*}\leftarrow \min \mathcal {J}\) otherwise. Let \(M^{*}\) be the \(j^{*}\)-th query of \(\textbf{A}\). Then, for the j-th query M of \(\textbf{A}\), \(\textbf{D}_{k}\) acts as follows:

  • Suppose that \(m=k+1\). Then,

    • \(\textbf{D}_{k}\) asks \((\textsf{x}_{c},M_{k+1,0},M_{k+1,1})\) and \((\textsf{x}_{c},M_{k+1,0},M_{k+1,1}\oplus \delta )\) to \(\ddot{F}_{j^{*}}[K_{j^{*}}]\) and returns \(\textsf{cf}^{\ddot{F}_{j^{*}}}(K_{j^{*}}\oplus c,M_{k+1})\) to \(\textbf{A}\) if \(j^{*}=j\) or \(j^{*}<j\wedge M_{[1,k]}^{*}=M_{[1,k]}\), and

    • \(\textbf{D}_{k}\) asks \((\textsf{sw}\circ \textsf{x}_{c},M_{k+1,0},M_{k+1,1})\) and \((\textsf{sw}\circ \textsf{x}_{c},M_{k+1,0},M_{k+1,1}\oplus \delta )\) to \(\ddot{F}_{j^{*}}[K_{j^{*}}]\) and returns \(\textsf{cf}^{\ddot{F}_{j^{*}}}(\textsf{sw}(K_{j^{*}})\oplus c,M_{k+1})\) to \(\textbf{A}\) otherwise.

    In both of the cases above, c is chosen as specified by Formula (1).

  • Suppose that \(m\ge k+2\). Then,

    • \(\textbf{D}_{k}\) asks \((\textsf{id},M_{k+1,0},M_{k+1,1})\) and \((\textsf{id},M_{k+1,0},M_{k+1,1}\oplus \delta )\) to \(\ddot{F}_{j^{*}}[K_{j^{*}}]\) and returns \(\textsf{I}^{c}(\textsf{cf}^{\ddot{F}_{j^{*}}}(K_{j^{*}},M_{k+1}),M_{[k+2,m]})\) to \(\textbf{A}\) if \(j^{*}=j\) or \(j^{*}<j\wedge M_{[1,k]}^{*}=M_{[1,k]}\), and

    • \(\textbf{D}_{k}\) asks \((\textsf{sw},M_{k+1,0},M_{k+1,1})\) and \((\textsf{sw},M_{k+1,0},M_{k+1,1}\oplus \delta )\) to \(\ddot{F}_{j^{*}}[K_{j^{*}}]\) and returns \(\textsf{I}^{c}(\textsf{cf}^{\ddot{F}_{j^{*}}}(\textsf{sw}(K_{j^{*}}),M_{k+1}),M_{[k+2,m]})\) to \(\textbf{A}\) otherwise.

    In both of the cases above, c is chosen as specified by Formula (1).

In the process above, for the j-th query M, if \(M_{[1,k]}\) is new, that is, \(\mathcal {J}=\emptyset \), then \(\textbf{D}_{k}\) uses the new oracle \(\ddot{F}_{j}[K_{j}]\) to compute the answer to the query. It implies that new \(K_{j}\), which is chosen uniformly at random, is assigned to new \(M_{[1,k]}\). Suppose that the oracles of \(\textbf{D}_{k}\) are \(\ddot{E}[K_{1}],\ldots ,\ddot{E}[K_{q}]\). Then, \(\textbf{D}_{k}\) implements \(\textsf{Hy}[k]^{\zeta ,\eta }\) as the oracle of \(\textbf{A}\). On the other hand, suppose that the oracles of \(\textbf{D}_{k}\) are \(\ddot{\rho }_{1}[K_{1}],\ldots ,\ddot{\rho }_{q}[K_{q}]\). Then, \(\textbf{D}_{k}\) implements \(\textsf{Hy}[k+1]^{\zeta ,\eta }\) if \(K_{i}\not =\textsf{sw}(K_{i})\) for every i sich that \(1\le i\le q\). Thus,

(4)

\(\textbf{D}_{k}\) takes at most about \(t+O(\ell q\,T_{E})\) time and makes at most 2q queries.

From Inequality (2), Equalities (3) and (4), and Lemmas 1 and 2, there exist adversaries \(\dot{\textbf{A}}\) and \(\ddot{\textbf{A}}\) such that

figure i

Both \(\dot{\textbf{A}}\) and \(\ddot{\textbf{A}}\) take at most about \(t+O(\ell q\,T_{E})\) time and make at most 2q queries.    \(\square \)

figure j