Keywords

1 Introduction

Nonce-Based Authenticated Encryption. Authenticated encryption (AE) schemes aim at protecting both the privacy and the integrity of submitted messages. Authenticated encryption schemes that allow to authenticate not only the encrypted message, but also associated data, are commonly known as AEAD schemes [22]. The common security notions for AE schemes concern the prevention of any leakage about the encrypted messages except for their lengths. Since stateless schemes would enable adversaries to detect a duplicate encryption of the same associated data and message under the current key, Rogaway proposed nonce-based encryption [24], where the user provides an additional nonce for every message she wants to process. In theory, the concept of nonces is simple. However, the practice has shown numerous examples of implementation failures, and settings that render it difficult to almost impossible to prevent nonce reuse (cf. [8]). Before the CAESAR competition, the majority of widely used AE schemes protected neither the confidentiality nor the integrity of messages in the case of nonce repetitions. As a consequence, a considerable number of CAESAR candidates aimed a certain level of security if nonces repeat (e.g., [1, 9, 10, 15]).

Parallelizable MACs in Authenticated Encryption. Block-cipher-based message authentication codes (MACs) are important components not only for authentication, but also as part of AE schemes, where they are used to derive an initialization vector (IV) that is then used for encryption. In particular, parallelizable MACs like PMAC [6] allow to process multiple blocks in parallel, which is beneficial for software performance on current x64 processors. Since PMAC has several further desirable properties, e. g. being online and incremental, it is not a surprise that all the CAESAR candidates cited above essentially combine a variant of PMAC (or its underlying hash function) with a block-cipher-based mode of operation for efficiently processing associated data and message.

Beyond-Birthday-Bound AE. Besides performance, the quantitative security guarantees are important aspects for AE schemes. The privacy and authenticity guarantees of the AE schemes cited above are limited by the birthday bound of \(O(\ell ^2/2^n)\), where n denotes the state size of the underlying primitive, and \(\ell \) the number of blocks processed over all queries. Since the schemes above possess an n-bit state, a state collision that leads to attacks has significant probability after approximately \(2^{n/2}\) blocks have been processed under the same key.

To address this issue, Peyrin and Seurin presented Synthetic Counter in Tweak (SCT) [20], a beyond-birthday-bound (BBB) AE scheme based on a tweakable block cipher under a single key. Internally, SCT is a MAC-then-Encrypt composition: the MAC part is a PMAC-like construction, called EPWC. The encryption part is Counter-in-Tweak (CTRT), an efficient mode of operation that takes an n-bit nonce and an n-bit IV. Both EPWC and CTRT guarantee BBB security as long as nonces never repeat. However, the security of the nonce-IV-based CTRT degrades to the birthday bound with an increasing number of nonce reuses; even worse, the security of EPWC (and consequently that of SCT) immediately reduces to the birthday bound if a single nonce repeats once. In [21, p. 7], the extended version of [20], the authors remarked therefore (among others) the following open problem: “[...] to construct an AE scheme which remains BBB-secure even when nonces are arbitrarily repeated. The main difficulty is to build a deterministic, stateless, BBB-secure MAC, which is known to be notably hard”. Intuitively, an efficient block-cipher-based BBB-secure MAC with 2n bit output length could allow to construct such a deterministic AE scheme.Footnote 1 Thus, this work will put a large focus on the construction of BBB-secure MACs.

Previous Work. Naito [17] proposed two MACs with full PRF security based on a tweakable block cipher: PMAC_TBC3k and PMAC_TBC1k. While the former requires three keys, the latter uses tweak-based domain separation to require only a single key. Extending the latter seemed a well-suited starting point for our work since such a MAC could be combined in straight-forward manner with a BBB-secure mode of encryption. Though, during our studies, we found that the analysis in [17] assumed internal values to be independent, which—as we will show—cannot always be guaranteed. Since the proof depended largely on this aspect, we developed an alternative analysis for our construction and derived a corrected bound for a PMAC_TBC1k-like variant with n-bit output. So, despite the assumption in the original proof, we confirm that Naito’s MAC is secure for close to \(2^{n-2}\) blocks processed under the same key.

Contribution. Our contributions are threefold: first, we propose a BBB-secure parallelizable MAC, called PMAC2x, which produces 2n-bit outputs and bases on a tweakable block cipher. Figure 1 provides a schematic illustration. Our MAC differs from PMAC_TBC1k mainly in the fact of the extended output, and minorly in the point that we add support for inputs whose length is not a multiple of n. As our second contribution, we briefly revisit the analysis by Naito on PMAC_TBC1k and show that we can easily adapt our proof for PMAC2x and derive a secure variant that we call PMACx which XORs both its outputs and produces only n-bit tags. Table 1 compares our constructions to earlier parallelizable BBB-secure MACs. As our third contribution, we combine PMAC2x with the purely IV-based variant of Counter-in-Tweak to a single-primitive, single-key deterministic authenticated encryption scheme, which we call SIVx, and which provides BBB-security without assumptions about nonces.

Table 1. Previous parallel BBB-secure MACs. (T)BC \(=\) (tweakable) block cipher, q \(=\) max. #queries, m \(=\) max. #blocks per query, \(\ell \) \(=\) max. total #blocks.

Earlier Parallelizable MACs. A considerable amount of works considered parallel MACs; parallel XOR-MACs have already been introduced in 1995 by Bellare et al. [2]; their constructions fed the message blocks together with a counter into a primitive to obtain stateful and randomized MACs. Bernstein [5] published the Protected Counter Sum (PCS), which transformed an XOR-MAC with an independent PRF into a stateless deterministic MAC. PMAC was described by Black and Rogaway first in [6], and was slightly modified to PMAC1 in [23]. Since then, the security of PMAC has been rigorously studied in various works [12, 14, 16, 18, 19]. The first BBB-secure parallelizable MAC was proposed by Yasuda [25]; His PMAC\(^+\) construction is a three-key version of PMAC which possesses two n-bit state values, which are processed by two independently keyed PRPs, and are XORed to produce the tag. Datta et al. [7] derived a single-key version thereof, called 1k_PMAC\(^+\). While those are rate-1 designs with larger internal state, there also exist slightly less efficient proposals with smaller state. Yasuda [26] introduced PMAC with parity (PMAC/P), which processes each sequence of r consecutive message blocks in PMAC-like manner, but inserts the XOR sum of those r blocks as an additional block. Zhang’s PMACX construction [27] generalized PMAC/P by multiplying the input with an MDS matrix before authentication. In a similar direction goes LightMAC [13], a lightweight variant similar to Bernstein’s PCS. However, the security guarantees of all earlier parallelizable MACs in this paragraph are far from the optimal PRF bound.

Fig. 1.
figure 1

Processing an m-block message with a partial final block in PMAC2x. \(\widetilde{E}: \{0,1\}^{k} \times \{0,1\}^{d} \times \{0,1\}^{t} \times \{0,1\}^{n} \rightarrow \{0,1\}^{n}\) denotes a tweakable block cipher, \(\textsc {Conv}: \{0,1\}^{n} \rightarrow \{0,1\}^{t}\) a regular function, and \(\odot \) multiplication in Galois-Field \(\mathbb {GF}(2^n)\).

2 Preliminaries

General Notation. We use lowercase letters xy for indices and integers, uppercase letters XY for binary strings and functions, calligraphic uppercase letters \(\mathcal {X}, \mathcal {Y} \) for sets; \(X \,\Vert \, Y\) for the concatenation of binary strings X and Y, and \(X \oplus Y\) for their bitwise XOR. We indicate the length of X in bits by |X|, and write \(X_i\) for the i-th block, X[i] for the i-th most-significant bit of X, and X[i..j] for the bit sequence \(X[i],\ldots ,X[j]\). We denote by \(X \twoheadleftarrow \mathcal {X} \) that X is chosen uniformly at random from the set \(\mathcal {X} \). We define \(\mathsf {Func} (\mathcal {X}, \mathcal {Y})\) for the set of all functions \(F: \mathcal {X} \rightarrow \mathcal {Y} \), \(\mathsf {Perm} (\mathcal {X})\) for the set of all permutations \(\pi : \mathcal {X} \rightarrow \mathcal {X} \), and \(\mathsf {\widetilde{Perm}} (\mathcal {T}, \mathcal {X})\) for the set of tweaked permutations over \(\mathcal {X} \) with tweak space \(\mathcal {T} \). We define by \(X_1, \ldots , X_j \xleftarrow {x} X\) an injective splitting of a string X into blocks of x-bit such that \(X = X_1 \,\Vert \, \cdots \,\Vert \, X_j\), \(|X_i| = x\) for \(1 \le i \le j-1\), and \(|X_j| \le x\). For two sets \(\mathcal {X} \) and \(\mathcal {Y} \), let \(\mathcal {X} \xleftarrow {\cup } \mathcal {Y} \) denote \(\mathcal {X} \leftarrow \mathcal {X} \cup \mathcal {Y} \). A uniform random function \(\rho : \mathcal {X} \rightarrow \mathcal {Y} \) is a random variable uniformly distributed over \(\mathsf {Func} (\mathcal {X}, \mathcal {Y})\). Given a function \(F: \mathcal {X} \rightarrow \mathcal {Y} \), we write \(\textsf {domain} (F)\) for the set of all inputs \(X \in \mathcal {X} \) to F that occurred before (i.e., excluding) the current query; similarly, we write \(\textsf {range} (F)\) for the set of all outputs \(Y \in \mathcal {Y} \) that occurred before the current query. We borrow the notation for a restriction on a set from [8]: let \(\mathcal {Q} \subseteq (\mathcal {X} \times \mathcal {Y} \times \mathcal {Z})^*\), then we denote by \(\mathcal {Q} _{|Y, Z} = \{ (Y, Z) \,|\, \exists X: (X, Y, Z) \in \mathcal {Q} \}\) the restriction of \(\mathcal {Q} \) to values \(Y \in \mathcal {Y} \) and \(Z \in \mathcal {Z} \). This generalizes in the obvious way.

For an event E, we denote by \(\Pr [E]\) the probability of E. We write \(\langle x \rangle _{n} \) for the binary representation of an integer x as an n-bit string, or short \(\langle x \rangle \) if n is clear from the context, in big-endian manner, e. g., \(\langle 1 \rangle _4\) would be encoded to (0001).

Adversaries. An adversary \(\mathbf {A} \) is an efficient Turing machine that interacts with a given set of oracles that appear as black boxes to \(\mathbf {A}\). We denote by \(\mathbf {A} ^{\mathcal {O} }\) the output of \(\mathbf {A}\) after interacting with some oracle \(\mathcal {O} \). We write for the advantage of \(\mathbf {A}\) to distinguish between oracles \(\mathcal {O} ^1\) and \(\mathcal {O} ^2\). All probabilities are defined over the random coins of the oracles and those of the adversary, if any. We write \(\mathbf {{Adv}}^{X}_{F}(q, \ell , \theta ) := \max _{\mathbf {A}}\{\mathbf {{Adv}}^{X}_{F}(\mathbf {A})\}\) for the maximal advantage over all X-adversaries \(\mathbf {A}\) on F that run in time at most \(\theta \) and pose at most q queries of at most \(\ell \) blocks in total to its oracles. Wlog., we assume that \(\mathbf {A}\) never asks queries to which it already knows the answer.

We will provide pseudocode descriptions of the oracles, which will be referred to as games, according to the game-playing framework by Bellare and Rogaway [3]. Each game consists of a set of procedures. We define \(\Pr [G(\mathbf {A}) \Rightarrow x]\) as the probability that the game G outputs x when given \(\mathbf {A}\) as input.

Definition 1 (TPRP Advantage)

Let \(\widetilde{E}: \mathcal {K} \times \mathcal {T} \times \mathcal {X} \rightarrow \mathcal {X} \) be a tweakable block cipher with non-empty key space \(\mathcal {K}\) and tweak space \(\mathcal {T} \). Let \(\mathbf {A}\) a computationally bounded adversary with access to an oracle, where \(K \twoheadleftarrow \mathcal {K} \) and \(\widetilde{\pi } \twoheadleftarrow \mathsf {\widetilde{Perm}} (\mathcal {T}, \mathcal {X})\). Then, the TPRP advantage of \(\mathbf {A}\) on \(\widetilde{E} \) is defined as .

A MAC is a tuple of functions \(F: \mathcal {K} \times \mathcal {X} \rightarrow \mathcal {Y} \) with non-empty key space \(\mathcal {K}\), and a generic verification function \(\textsc {Verify}: \mathcal {K} \times \mathcal {X} \times \mathcal {Y} \rightarrow \{1, \bot \}\), where for all \(K \in \mathcal {K} \) and \(X \in \mathcal {X} \), \(\textsc {Verify} _K(X, Y)\) returns 1 iff \(F_K(X) = Y\) and \(\bot \) otherwise. We use \(\bot \), when in place of an oracle, to always return the invalid symbol \(\bot \). It is well-known that if F is a secure PRF, it is also a secure MAC; however, the converse statement is not necessarily true.

Definition 2 (PRF Advantage)

Let \(F: \mathcal {K} \times \mathcal {X} \rightarrow \mathcal {Y} \) be a function with non-empty key space \(\mathcal {K}\), and \(\mathbf {A}\) a computationally bounded adversary with access to an oracle, where \(K \twoheadleftarrow \mathcal {K} \) and \(\rho \twoheadleftarrow \mathsf {Func} (\mathcal {X}, \mathcal {Y})\). Then, the PRF advantage of \(\mathbf {A}\) on F is defined as .

3 Definition of PMAC2x and PHASHx

figure a

This section defines the generic PMAC2x construction and its underlying hash function PHASHx. Fix integers kntd, with \(d \ge 2\). Let \(\mathcal {K} = \{0,1\}^{k}\) and \(\mathcal {T} = \{0,1\}^{t}\) be non-empty sets of keys and tweaks, respectively. Moreover, derive a set of domains \(\mathcal {D} := \{0,1,2,3\} = \{0,1\}^{d}\) which are encoded as their respective d-bit values, and a domain-tweak set \(\mathcal {T} ' := \mathcal {D} \times \mathcal {T} \). Let \(\mathcal {M} \subseteq (\{0,1\}^{n})^{*}\) denote an input space. Further, let \(\widetilde{E}: \mathcal {K} \times \mathcal {T} ' \times \{0,1\}^{n} \rightarrow \{0,1\}^{n}\) denote a tweakable block cipher. We will often write \(\widetilde{E} _{K}^{D, T}(\cdot )\) as short form of \(\widetilde{E} (K, D, T, \cdot )\). \(K \in \mathcal {K} \), \(D \in \mathcal {D} \), and \(T \in \mathcal {T} \) denote key, domain, and tweak, respectively. \(\textsc {Conv}: \{0,1\}^{n} \rightarrow \{0,1\}^{t}\) be a regular functionFootnote 2 which is used to convert the outputs of PHASHx, \(X_m\) and \(Y_m\), so they can be used as tweaks of \(\widetilde{E} \) in the finalization step. We denote by \(\textsc {PMAC2x} [\widetilde{E} ]\) and \(\textsc {PHASHx} [\widetilde{E} ]\) the instantiation of PMAC2x and PHASHx with \(\widetilde{E} \). Both are defined, with a default instantiation of \(\textsc {Conv} \), in Algorithm 1.

4 Security Analysis of PMAC2x

Theorem 1

Let \(\widetilde{E} \) and \(\textsc {PMAC2x} [\widetilde{E} ]\) be defined as in Sect. 3. Let \(d + t = n\), and let \(m < 2^{t}\) denote the maximum number of n-bit blocks of any query. Then

$$\begin{aligned} \mathbf {{Adv}}^{\textsc {PRF}}_{\textsc {PMAC2x} [\widetilde{E} ]}(q, \ell , \theta ) \le&\frac{2^{2d}q^2}{2 \cdot (2^n - q)^2} + \frac{2^{d}q^3}{3 \cdot 2^{2n}(2^n - q)} + \frac{2^{d}q^2}{2^{n}(2^n - q)} \\&+\, \mathbf {{Adv}}^{\textsc {TPRP}}_{\widetilde{E}}(\ell + 2q, O(\theta + \ell + 2q)). \end{aligned}$$

The final term results from a standard argument after replacing the tweakable block cipher \(\widetilde{E} \) by a random tweakable permutation \(\widetilde{\pi } \twoheadleftarrow \mathsf {\widetilde{Perm}} (\mathcal {T} ', \{0,1\}^{n})\).

figure b

Let \(\mathbf {A}\) be an adversary that makes at most q queries of at most m blocks each and of at most \(\ell \) blocks in total. We assume, \(\mathbf {A}\) does not ask duplicate queries and has the goal to distinguish between a \(\textsc {PMAC2x} [\widetilde{\pi } ]\) oracle with an internally sampled tweaked permutations \(\widetilde{\pi } \) and a random function \(\rho : \{0,1\}^{*} \rightarrow \{0,1\}^{2n}\).

We consider the game described in Algorithm 2. The game without the boxed statements coincides with a random function \(\rho \), whereas the game with them exactly represents \(\textsc {PMAC2x} [\widetilde{\pi } ]\), performing lazy sampling for the permutations \(\widetilde{\pi } ^{2, {\widehat{Y}_{m}}}(\cdot )\) and \(\widetilde{\pi } ^{3, \widehat{X}_m}(\cdot )\), for all \(\widehat{X}_m, {\widehat{Y}_{m}} \in \{0,1\}^{t}\). Both algorithms differ only when bad events occur. Hence, by the fundamental lemma of game playing [4], it holds

$$\begin{aligned} \Pr [ \mathbf {A} ^{\textsc {PMAC2x} [\widetilde{\pi } ](\cdot )} \Rightarrow 1 ] - \Pr [ \mathbf {A} ^{\rho (\cdot )} \Rightarrow 1 ]&\le \Pr [ \mathbf {A} \text { sets } \textsf {bad} ]. \end{aligned}$$

In the remainder, we consider five cases which cover all possibilities:

  • Case1: \((\widehat{X}_m, {\widehat{Y}_{m}}) \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}}\).

  • Case2: \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}} \wedge U \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \wedge V \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m})\).

  • Case3: \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}} \wedge U \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \wedge V \not \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m})\).

  • Case4: \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}} \wedge U \not \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \wedge V \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m})\).

  • Case5: \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}} \wedge U \not \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \wedge V \not \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m})\).

We list Case 5 only for the sake of completeness. It is easy to see that in Case 5, the output of the game is indistinguishable between the worlds. We use \(M^i\), \(\widehat{X}_m ^i\), \({\widehat{Y}_{m}} ^i\), \(U^i\), \(V^i\) to refer to the respective values of the i-th query, where \(i \in [1, q]\), and assume it is the current query of the adversary. Additionally, we will use indices j and k, where \(j, k \in [1, i-1]\), to refer to previous queries.

Case1 : For this case, we revisit the fact that for two fixed disjoint queries \(M^i\) and \(M^j\), the values \(X_m\) and \(Y_m\) are results of two random variables. A variant of the proof is given e.g. in [17, Sect. 3.3], and revisited in the following only for the sake of completeness. Fix query indices \(i \in [1, q]\) and \(j \in [1, i-1]\). Let m and \(m'\) denote the number of blocks of the i-th and j-th query, respectively; moreover, let \(X_m^i\), \(Y_m^i\) denote the values \(X_m\) and \(Y_m\) of the i-th query and \(X_{m'}^j\), \(Y_{m'}^j\) those of the j-th query, respectively. \(X_m^i\), \(Y_m^i\), \(X_m^j\), and \(Y_m^j\) result from:

$$\begin{aligned} X_m^i&= C_1^i \oplus C_2^i \oplus \cdots \oplus C_m^i \qquad&Y_m^i&= 2^{m} C_1^i \oplus 2^{m-1} C_2^i \oplus \cdots \oplus 2 \cdot C_m^i, \\ X_{m'}^i&= C_1^j \oplus C_2^j \oplus \cdots \oplus C_{m'}^j \qquad&Y_{m'}^j&= 2^{m'} C_1^j \oplus 2^{m'-1} C_2^j \oplus \cdots \oplus 2 \cdot C_{m'}^j. \end{aligned}$$

So, we want to bound the probability for the following equational system:

$$\begin{aligned} C_1^i \oplus C_2^i \oplus \cdots \oplus C_m^i&= C_1^j \oplus C_2^j \oplus \cdots \oplus C_{m'}^j \\ 2^{m} C_1^i \oplus 2^{m-1} C_2^i \oplus \cdots \oplus 2 \cdot C_m^i&= 2^{m'} C_1^j \oplus 2^{m'-1} C_2^j \oplus \cdots \oplus 2 \cdot C_{m'}^j. \end{aligned}$$

There exist three distinct subcases which cover all possibilities:

  • Subcase 1: \(\mathbf {m \ne m'}\) . In this case, the equations above provide a unique solution set for two random variables; thus, the probability that the equations hold for two fixed queries is upper bounded by \(1/(2^n - (i-1))^2\).

  • Subcase 2: \(\mathbf {m = m'}\) and there exists \(\mathbf {\varvec{\alpha } \in [1, m]}\) s.t. \(\mathbf {C^i_{\varvec{\alpha }} \ne C^j_{\varvec{\alpha }}}\) and for all \(\mathbf {\varvec{\beta } \in [1, m]}\) with \(\mathbf {\varvec{\beta } \ne \varvec{\alpha } {:}\; C^i_{\varvec{\beta }} = C^j_{\varvec{\beta }}}\) . In this case, both messages share only a single different output block. Thus, the equations above never hold.

  • Subcase 3: \(\mathbf {m = m'}\) and there exist distinct \(\mathbf {\varvec{\beta } \in [1, m]}\) with \(\mathbf {\varvec{\beta } \ne \varvec{\alpha } {:}}\) \({C^i_{\varvec{\beta }} = C^j_{\varvec{\beta }}}\) . In this case, both messages share only a single different output block. Thus, the equations above can never hold.

So, the probability for two fixed disjoint queries \(M^i\) and \(M^j\) that \((X_m^i, Y_m^i) = (X_{m'}^j, Y_{m'}^j)\) holds, is bounded by \(1/(2^n - q)^2\). Since \(\widehat{X}_m ^i\) and \({\widehat{Y}_{m}} ^i\) are derived from \(X_m^i\) and \(Y_m^i\), respectively by a regular function (and so are \(\widehat{X}_m ^j\) and \({\widehat{Y}_{m}} ^j\) derived from \(X_{m'}^j\) and \(Y_{m'}^j\), respectively), it follows that the probability of \((\widehat{X}_m ^i, {\widehat{Y}_{m}} ^i) = ({{{\widehat{X}}}_{m'}} ^j, {{{\widehat{Y}}}_{m'}} ^j)\) to hold for the i-th and j-th query, with \(j \in [1, i-1]\), is at most

$$\begin{aligned} (i-1) \cdot \frac{2^{d}}{(2^n - q)} \cdot \frac{2^{d}}{(2^n - q)}&= \frac{2^{2d}(i - 1)}{(2^n - q)^2}. \end{aligned}$$

Case2 : In this case, there exists some previous query \(({{{\widehat{X}}}_{m'}} ^j, {{{\widehat{Y}}}_{m'}} ^j, U^j, V^j)\) s.t. \(U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j\), and a distinct previous query \(({{{\widehat{X}}}_{m''}} ^k, {{{\widehat{Y}}}_{m''}} ^k, U^k, V^k)\) s.t. \(V = V^k \wedge \widehat{X}_m = {{{\widehat{X}}}_{m''}} ^k\). From our assumption \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}}\), it follows that \(j \ne k\); otherwise, the current query would have stepped into Case1 instead. We can bound the probability by

$$\begin{aligned}&\Pr \left[ (U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j) \wedge (V = V^k \wedge \widehat{X}_m = {{{\widehat{X}}}_{m''}} ^k) \right] \\ \le&\Pr \left[ U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j \wedge V = V^k \left| \right. \widehat{X}_m = {{{\widehat{X}}}_{m''}} ^k \right] . \end{aligned}$$

U and V are chosen independently and uniformly at random from \(\{0,1\}^{n}\) each, and can collide with at most \(i-1\) previous values \(U^j\) and at most \(i-1\) previous values \(V^k\), respectively. For fixed j and k, the probability for U to collide with \(U^j\) is upper bounded by \(1/2^n\), and independently, the probability for V to collide with \(V^k\) is also \(1/2^n\). Since the game chooses U and V independently from \(Y_m\), the probability that \({\widehat{Y}_{m}} \) collides with \({{{\widehat{Y}}}_{m'}} ^j\) is at most \(2^d/(2^n - q)\) since we assumed that the adversary poses no duplicate queries, and therefore, \(Y_m\) and \(Y_{m'}^j\) are results of two random variables. Since the collision of \(U = U^j\) already fixes the colliding query pair, there is no additional factor \((i-1)\) for the choice of which pairs of \({\widehat{Y}_{m}} \) and \({{{\widehat{Y}}}_{m'}} ^j\) to collide. It follows that the probability for this case to occur at the i-th query is upper bounded by

$$\begin{aligned} \frac{i-1}{2^n} \cdot \frac{i-2}{2^n} \cdot \frac{2^d}{2^n - q} \le \frac{2^d (i-1)^2}{2^{2n}(2^n - q)}. \end{aligned}$$

Case3 : In this case, there exists some previous query \(({{{\widehat{X}}}_{m'}} ^j, {{{\widehat{Y}}}_{m'}} ^j, U^j, V^j)\) s.t. \(U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j\). From our assumption \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}}\) and \({\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j\) follows that \(\widehat{X}_m \ne {{{\widehat{X}}}_{m'}} ^j\) holds, like in Case2. We can bound the probability by

$$\begin{aligned}&\Pr \left[ U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j \wedge V \not \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m}) \right] \\ \le&\Pr \left[ U = U^j \wedge {\widehat{Y}_{m}} = {{{\widehat{Y}}}_{m'}} ^j \left| \right. V \not \in \textsf {range} (\widetilde{\pi } ^{3,\widehat{X}_m}) \right] . \end{aligned}$$

For a fixed j-th query, the probability that \({\widehat{Y}_{m}} \) collides with \({{{\widehat{Y}}}_{m'}} ^j\) is at most \(2^d/(2^n - q)\). Since U is chosen uniformly at random from \(\{0,1\}^{n}\) and independently from \(Y_m\), U can collide with \(U^j\) with probability \(1/2^n\). So, the probability of this case to occur for the i-th query can be upper bounded by

$$\begin{aligned} \frac{2^d}{2^n - q} \cdot \frac{i-1}{2^n} = \frac{2^d (i-1)}{2^n(2^n - q)}. \end{aligned}$$

Case4 : In this case, it holds that \(V = V^j \wedge \widehat{X}_m = {{{\widehat{X}}}_{m'}} ^j\). From \((\widehat{X}_m, {\widehat{Y}_{m}}) \not \in \mathcal {Q} _{|\widehat{X}_m, {\widehat{Y}_{m}}}\) and \(\widehat{X}_m = {{{\widehat{X}}}_{m'}} ^j\) follows here that \({\widehat{Y}_{m}} \ne {{{\widehat{Y}}}_{m'}} ^j\) holds, analogously to Case2 and Case3. We can bound the probability by

$$\begin{aligned}&\Pr \left[ V = V^j \wedge \widehat{X}_m = {{{\widehat{X}}}_{m'}} ^j \wedge U \not \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \right] \\ \le&\Pr \left[ V = V^j \wedge \widehat{X}_m = {{{\widehat{X}}}_{m'}} ^j \left| \right. U \not \in \textsf {range} (\widetilde{\pi } ^{2,{\widehat{Y}_{m}}}) \right] . \end{aligned}$$

Obviously, this case can be handled similarly as Case3. For a fixed j-th query, the probability that \(\widehat{X}_m \) collides with \({{{\widehat{X}}}_{m'}} ^j\) is at most \(2^d/(2^n - q)\). Since V is chosen uniformly at random from \(\{0,1\}^{n}\) and independently from \(X_m\), V can collide with \(V^j\) with probability \(1/2^n\). So, the probability of this case to occur for the i-th query can also be upper bounded by

$$\begin{aligned} \frac{2^d (i-1)}{2^n(2^n - q)}. \end{aligned}$$

Taking the terms from all cases and the union bound over at most q queries gives

$$\begin{aligned} \Pr \left[ \mathbf {A} \text { sets } \textsf {bad} \right]&\le \sum _{i = 1}^{q} \left( \frac{2^{2d}(i-1)}{(2^n - q)^2} + \frac{2^{d}(i-1)^2}{2^{2n}(2^n - q)} + 2 \cdot \frac{2^{d}(i-1)}{2^{n}(2^n - q)} \right) \\&\le \frac{2^{2d}q^2}{2 \cdot (2^n - q)^2} + \frac{2^{d}q^3}{3 \cdot 2^{2n}(2^n - q)} + \frac{2^{d}q^2}{2^{n}(2^n - q)}. \end{aligned}$$

5 Security Analysis of PMACx

This section considers a variant of PMAC2x, PMACx, that adds a final XOR to produce only an n-bit tag, following the design of PMAC_TBC1k. A schematic illustration is given in Fig. 2. We revisit the assumption by Naito, and show that our proof of PMAC2x needs only a slight adaption for PMACx.

Previous Analysis. Theorem 2 in [17] proves the security of PMAC_TBC1k with the help of an analysis of multi-collisions of the final chaining values (\(X_m\) and \(Y_m\) in our notation). Note that our notation differs from [17] to be consistent to our previous section. Define two monotone events \(\textsf {mcoll} _1\) and \(\textsf {mcoll} _2\). Let \(\rho \) and \(\xi \) denote positive integers and define three sets \(\mathcal {X} \), \(\mathcal {Y} \), and \(\mathcal {Q} \) which store the values \(\widehat{X}_m ^i\), \({\widehat{Y}_{m}} ^i\), and the tuples \((X_m^i, {\widehat{Y}_{m}} ^i)\), respectively, of the queries \(1 \le i \le q\).

$$\begin{aligned} \textsf {mcoll} _1 \,{:=}&(\exists \, \widehat{X}_m ^1, \ldots , \widehat{X}_m ^{\rho } \in \mathcal {X} \text { s.t. } \widehat{X}_m ^1 = \ldots = \widehat{X}_m ^{\rho }) \\&\vee \, (\exists \, {\widehat{Y}_{m}} ^1, \ldots , {\widehat{Y}_{m}} ^{\rho } \in \mathcal {Y} \text { s.t. } {\widehat{Y}_{m}} ^1 = \ldots = {\widehat{Y}_{m}} ^{\rho }), \\ \textsf {mcoll} _2 \,{:=}&\exists \, (X_m^1, {\widehat{Y}_{m}} ^1), \ldots , (X_m^{\xi }, {\widehat{Y}_{m}} ^{\xi }) \in \mathcal {Q} \text { s.t. } (X_m^1, {\widehat{Y}_{m}} ^1) = \ldots = (X_m^{\xi }, {\widehat{Y}_{m}} ^{\xi }). \end{aligned}$$

The original proof further defined a monotone compound event \(\textsf {mcoll}:= \textsf {mcoll} _1 \vee \textsf {mcoll} _2\) and used the fact that

$$\begin{aligned} \Pr \left[ \mathbf {A} \text { sets } \textsf {bad} \right]&= \Pr \left[ \mathbf {A} \text { sets } \textsf {bad} \wedge \textsf {mcoll} \right] + \Pr \left[ \mathbf {A} \text { sets } \textsf {bad} \wedge \lnot \textsf {mcoll} \right] \\&\le \Pr \left[ \textsf {mcoll} _1 \right] + \Pr \left[ \textsf {mcoll} _2 \right] + \Pr \left[ \mathbf {A} \text { sets } \textsf {bad} | \lnot \textsf {mcoll} \right] . \end{aligned}$$

The analysis in [17] bounds \(\Pr [\textsf {mcoll} _1]\) as

$$\begin{aligned} \Pr [\textsf {mcoll} _1]&\le 2 \cdot 2^{t} \cdot \left( {\begin{array}{c}q\\ \rho \end{array}}\right) \cdot \left( \frac{2^{n-t}}{2^n - q}\right) ^{\rho } \le 2^{t+1} \cdot \left( \frac{2^{n-t} \cdot e q}{\rho (2^n - q)}\right) ^{\rho }, \end{aligned}$$

using Stirling’s approximation \(x! \ge (x / e)^x\) for any x. Note, in PMAC_TBC1k, the domain size in PMAC2x is fixed to \(d = 2\) bits. The bound above consists of the probability that \(\rho \) values are all equal, \((2^{n-t}/(2^n - q))^{\rho }\); the fact that there are \(2^{t}\) tweak values; and the \(\left( {\begin{array}{c}q\\ \rho \end{array}}\right) \) possible ways to choose \(\rho \) out of q values. However, the bound holds only if the \(\rho \) colliding tweaks stem from \(\rho \) linearly independent random variables, which is not necessarily the case. Imagine a sequence of \(2^m\) queries which combine pair-wise distinct blocks \(\{M_i, M'_i\}\) with \(M_i \ne M'_i\), for \(1 \le i \le m\) position-wise, i. e., we have \(2^m\) queries of m blocks each: \(Q^{ 0} = (M_1, M_2, M_3, \ldots , M_m)\), \(Q^{ 1} = (M'_1, M_2, M_3, \ldots , M_m)\), \(Q^{ 2} = (M_1, M'_2, M_3, \ldots , M_m)\), ..., \(Q^{2^{m}-1} = (M'_1, M'_2, M'_3, \ldots , M'_m)\). When used as queries to PMAC_TBC1k, the \(2^m\) resulting values \(X^{i}_m\), for \(0 \le i \le 2^m-1\), depend on the linear combination of only 2m random variables. A similar argument holds for the values \(Y^i_m\), as well as for the similarly treated bound of \(\textsf {mcoll} _2\). Thus, the multi-collision bound demands a significantly more detailed analysis.

Fig. 2.
figure 2

PMACx, the variant of PMAC2x with n-bit output, following the design of PMAC_TBC1k.

Fixing the Analysis. From our proof for PMAC2x, we can now derive a corollary for a similar security bound for PMACx, which again can be easily transformed into a bound for PMAC_TBC1k.

Corollary 1

Let \(\widetilde{E} \) and \(\textsc {PMAC2x} [\widetilde{E} ]\) be defined as in Sect. 3. Let \(d+t = n\), and let \(m < 2^{t}\) denote the maximum number of n-bit blocks of any query. Then, it holds that \( \mathbf {{Adv}}^{\textsc {PRF}}_{\textsc {PMACx} [\widetilde{E} ]}(q, \ell , \theta ) \le \mathbf {{Adv}}^{\textsc {PRF}}_{\textsc {PMAC2x} [\widetilde{E} ]}(q, \ell , \theta ). \)

The proof can use a game almost identical to that in Algorithm 2, where we only modify Line 44 to return the XOR of U and V. This is shown in Algorithm 3. All further procedures and functions remain identical to those in Algorithm 2.

If \((U \,\Vert \, V)\) is indistinguishable from outputs of a 2n-bit random function \(\rho \), then each of the n-bit outputs U and V can be considered random. It follows, if U is indistinguishable from n-bit values, then the XOR sum of \(U \oplus V\) is also indistinguishable from a random n-bit value. Hence, the PRF advantage of \(\mathbf {A}\) on PMACx is upper bounded by that of an adversary \(\mathbf {A} '\) on PMAC2x with an equal amount of resources as \(\mathbf {A}\); hence, the corollary follows.

PMACx and PMAC_TBC1k differ in three aspects: (1) PMACx allows messages whose length is not a multiple of n bits by padding the final block with \(10^*\) and using a distinct tweak for it; (2) PMACx defines a generic d-bit domain encoding and defines a conversion function Conv for deriving the inputs for the finalization; and (3) PMACx adds a final doubling for a simpler and consistent description. Clearly, none of the differences affects the distribution of final chaining values \(\widehat{X}_m \) and \({\widehat{Y}_{m}} \). Hence, when fixing \(d = 2\), a security result for PMACx can be easily carried over to a bound for PMAC_TBC1k.

figure c

6 Definition and Security Analysis of SIVx

Next, we define the deterministic AE scheme SIVx, which combines PMAC2x and the IV-based Counter-in-Tweak mode IVCTRT. We recall the definitions of IV-based encryption and Deterministic AE security in the full version of this work [11]. Note, that it is straight-forward to derive a nonce-based AE scheme by fixing the nonce length and appending the nonce to the associated data.

Fig. 3.
figure 3

The deterministic AE scheme SIVx from the composition of \(\textsc {PMAC2x} [\widetilde{E} ]\) (top), and the \(\textsc {IVCTRT} [\widetilde{E} ]\) mode of encryption (bottom) [20]. The figure starts the message processing in PMAC2x from \(0^n\) only to prevent that \(X_0\) and \(Y_0\) cancel out.

IVCTRT. IVCTRT denotes the version of Counter in Tweak [21, Appendix C], which takes a 2n-bit random IV plus the message for each encryption. Let \(\mathcal {T} = \{0,1\}^{t}\), and \(\mathcal {T} ' = \{0,1\} \times \mathcal {T} \). The mode first splits \((U, V) \xleftarrow {n} IV\), and uses a given tweakable block cipher \(\widetilde{E}: \mathcal {K} \times \mathcal {T} ' \times \{0,1\}^{n} \rightarrow \{0,1\}^{n}\) in counter mode, with V as cipher input. Next, it derives \(T \leftarrow \textsc {Conv} '(U)\) from U with a regular function \(\textsc {Conv} ': \{0,1\}^{n} \rightarrow \mathcal {T} \) and increments T for every call to \(\widetilde{E} \) using addition modulo \(2^t\). We denote by \(\textsc {IVCTRT} [\widetilde{E} ]\) the instantiation of \(\textsc {IVCTRT} \) with \(\widetilde{E} \); from Theorem 1 and Appendix C in [21], we recall the following theorem:

Theorem 2

(ivE Security of IVCTRT ). Let \(\widetilde{\pi } \twoheadleftarrow \mathsf {\widetilde{Perm}} (\mathcal {T} ', \{0,1\}^{n})\) be an ideal tweakable block cipher. Let \(\mathbf {A}\) be an adversary which asks at most q queries of at most \(8 \le \ell \le |\mathcal {T} |\) blocks in total. Then

$$\begin{aligned} \mathbf {{Adv}}^{\textsc {ivE}}_{\textsc {IVCTRT} [\widetilde{\pi } ]}(\mathbf {A})&\le \frac{1}{2^n} + \frac{1}{|\mathcal {T} |} + \frac{4\ell \log q}{|\mathcal {T} |} + \frac{\ell \log ^2(\ell )}{2^n}. \end{aligned}$$
figure d

Definition of SIVx. We define the deterministic AE scheme \(\textsc {SIVx} [\widetilde{E} ]\) as the composition of \(\textsc {PMAC2x} [\widetilde{E} ]\) and \(\textsc {IVCTRT} [\widetilde{E} ]\), as given in Algorithm 4. A schematic illustration of the encryption process is depicted in Fig. 3. In general, we denote by \(\textsc {SIVx} [F, \varPi ]\) the instantiation of \(\textsc {SIVx} \) with a function F and an IV-based encryption scheme \(\varPi \) in SIVx. To use the same key in all calls to \(\widetilde{E} \), we parametrize PHASHx to separate the domains. We use the domains \(2 = (0010)_2\) and \(3 = (0011)_2\) for the finalization steps, \(4 = (0100)_2\) and \(5 = (0101)_2\) for processing the associated data, as well as \(6 = (0110)_2\) and \(7 = (0111)_2\) for processing the message in PMAC2x. We encode them into the \(d = 4\) most significant bits of the tweak. Inside IVCTRT, however, we use the single-bit domain 1 in all calls to \(\widetilde{E} \) for we lose only a single bit from the IV. For concreteness, we define the initial values in PMAC2x as \(X_0 = Y_0 = 0^n\).

Theorem 3

(DAE Security of SIVx). Let \(F: \mathcal {K} _1 \times \mathcal {A} \times \mathcal {M} \rightarrow \{0,1\}^{2n}\), and let \(\varPi = (\widetilde{\mathcal {E}}, \widetilde{\mathcal {D}})\) be an IV-based encryption scheme with key space \(\mathcal {K} _2\) and IV space \(\mathcal {IV} \). Let \(K_1 \twoheadleftarrow \mathcal {K} _1\) and \(K_2 \twoheadleftarrow \mathcal {K} _2\) be independent. Let \(\textsc {Conv} ': \{0,1\}^{n} \rightarrow \mathcal {IV} \) be a regular function. Let \(\mathbf {A}\) be a DAE adversary running in time at most \(\theta \), asking at most q queries of at most \(8 \le \ell < 2^{t}\) blocks in total. Then, it holds that

$$\begin{aligned} \mathbf {{Adv}}^{\textsc {DAE}}_{\textsc {SIVx} [F, \varPi ]}(\mathbf {A})&\le \mathbf {{Adv}}^{\textsc {ivE}}_{\varPi }(\theta + O(\ell ), q, \ell ) + \mathbf {{Adv}}^{\textsc {PRF}}_{F}(\theta + O(\ell ), q, \ell ) + \frac{q}{2^n}. \end{aligned}$$

We defer the proof of Theorem 3 to the full version of this work [11]. Inserting the bounds from Theorems 1 and 2, we obtain the corollary below, where F denotes \(\textsc {PMAC2x} [\widetilde{E} ]\) and \(\varPi \) represents \(\textsc {IVCTRT} [\widetilde{E} ]\).

Corollary 2

Fix positive integers knt and \(d = 4\). Define \(d + t = n\) and let \(\mathcal {T} = \{0,1\}^{t}\) and \(\mathcal {T} ' = \{0,1\}^{d} \times \{0,1\}^{t}\), and \(\mathcal {IV} = \{0,1\}^{n-1}\). Let \(\widetilde{E}: \mathcal {K} \times \mathcal {T} ' \times \{0,1\}^{n} \rightarrow \{0,1\}^{n}\), and \(\textsc {Conv} ': \{0,1\}^{n} \rightarrow \mathcal {IV} \) be a regular function. Let \(K \twoheadleftarrow \mathcal {K} \) and \(\mathbf {A}\) be a DAE adversary that runs in time at most \(\theta \), and asks at most q queries of at most \(8 \le \ell < 2^t\) blocks in total. Then

$$\begin{aligned} \mathbf {{Adv}}^{\textsc {DAE}}_{\textsc {SIVx} [\widetilde{E} ]}(\mathbf {A}) \le&\frac{2^{2d}q^2}{2 \cdot (2^n - q)^2} + \frac{2^{d}q^3}{3 \cdot 2^{2n}(2^n - q)} + \frac{2^{d}q^2}{2^{n}(2^n - q)} + \frac{4\ell \log q + 1}{2^{n-1}} + \\&\frac{q+1+\ell \log ^2(\ell )}{2^n} + \mathbf {{Adv}}^{\textsc {TPRP}}_{\widetilde{E}}(\theta + O(2\ell +2q), 2\ell + 2q). \end{aligned}$$

7 Conclusion

This work revisited the PMAC_TBC1k construction by Naito for constructing a MAC with beyond-birthday-bound (BBB) security and 2n-bit outputs, called PMAC2x. We identified a critical assumption in the previous analysis of PMAC_TBC1k and circumvented it by a new proof for PMAC2x; moreover, we could easily derive a proof for PMACx, a variant of our PMAC2x construction with n-bit outputs. So, we also provided a corrected bound for Naito’s construction. We obtained the positive result that all three constructions provide PRF security for up to \(O(q^2/2^{2n} + q^3/2^{3n})\) queries. With the help of PMAC2x, we constructed a BBB-secure AE scheme from a tweakable block cipher whose security is independent of nonces and which depends on a single primitive under a single key. We are aware that the 2n-bit tag of SIVx requires still as many bits to be transmitted as for the 2n-bit nonce-IV in SCT; future work could study how an appropriate truncation could reduce the transmission overhead while retaining BBB security.