Keywords

1 Introduction

An authenticated encryption (AE) algorithm aims to achieve message confidentiality and integrity (authentication). The majority of AE schemes nowadays process two additional inputs: associated data AD and a nonce. The associated data is a piece of information, such as a network packet header, that does not require the provision of confidentiality but it does require authentication. The nonce input to an AE (AD) scheme is also practically motivated. A nonce is a unique value that enables the simplification of randomized or stateful AE algorithm to a deterministic one. This removes the need for random values generation or storing a state across distinct encryptions. The formalisation of nonce-based authenticated encryption was introduced in 2002 by Rogaway [24].

Two of the most prominent AEAD schemes research, development and standardization efforts in recent years have been the CAESAR competition (2014–2018) [7] and the ongoing NIST lightweight cryptography standardization process (2018–). The CAESAR competition produced a portfolio of algorithms for recommended use in three categories: 1. Lightweight (resource constrained environments): Ascon [9] and ACORN [28]; 2. High-performance: AEGIS-128 [30] and OCB [19]; 3. Defense in depth (stronger security guarantees): COLM [1], Deoxys II [18], and MORUS [29]. The CAESAR winners come with various advantages over the present standards GCM (NIST SP 800-38D) and CCM (IEEE 802.11i, IPsec ESP and IKEv2) and are expected to be adopted by new applications and standards.

The main focus of the defense in depth CAESAR category is nonce misuse resistance (NMR), a security target motivated by attacks which exploit implementation flaws leading to a repeated use of the same nonce by an application. More precisely, authenticity preservation despite nonce misuse is stated as critical and a limited privacy damage from nonce misuse as desirable. Nonce-misuse resistant AE (MRAE) was introduced by Rogaway and Shrimpton [26]. Later works dealt with online nonce misuse-resistant AE (OAE) [11] and online AE(OAE2) [15]. Nonce misuse purportedly presents a greater risk for small devices, such as IoT, where nonces could repeat due to various memory or space constraints, or remote usage. Maintaining the correct use of nonces is also especially challenging in distributed systems where nodes and connections can fail at any time. Recent attacks have illustrated the severity of nonce misuse in practice. In 2016 Böck et al. [8] investigated the nonce misuse security of the AES-GCM [10] AE mode and managed to completely break the authenticity of those connections where servers repeated the nonce. Their Internet-wide survey identified 184 such HTTPS servers. They showed how this vulnerability can be utilized to inject seemingly valid content into encrypted sessions. The next year Vanhoef and Piesens [27] introduced the key reinstallation attack which forces nonce repetitions and breaks the WPA2 wireless protocol.

The MRAE [25] and the RAE [14] security notions capture the “best possible” security of nonce based AEAD (the latter for an extended syntax) in face of nonce repetitions but unfortunately can not be satisfied by any online AEAD scheme. These are AE schemes which parse the plaintext as a sequence of smaller, usually fixed-size blocks during encryption, and produce the ciphertext as a sequence of such blocks, such that \( {i}^{\text {th}} \) ciphertext block can be immediately computed after having seen the first i plaintext blocks. Importantly, online encryption is better fitted for lightweight applications, where it is often critical to compute the ciphertext blocks on the fly with constant latency (e.g. streaming), or where a constant memory implementation is required. Such constraints can be found most recently in the consumer grade IoT applications, which come with stringent cost constraints and often get inadequate security as a consequence [20, 21], and which would greatly benefit from a lightweight AEAD scheme with robust security.

In 2012 Fleischmann et al. [11] proposed a weaker NMR security notion called OAE for online schemes that retains the same level of integrity and a well-defined, albeit lower level of confidentiality in face of nonce misuse. Hoang et al. [15] pointed out inconsistencies in this definition, reformulated it, demonstrated that in certain situations the confidentiality afforded by OAE secure schemes is not sufficient, and critiqued several aspects of the OAE definition. Addressing the latter, Hoang et al. proposed an alternative to OAE dubbed OAE2 which is a stronger security with more versatile syntax. However, the change in syntax also means that the notion is inapplicable to designs adhering to the most common nonce-based AEAD syntax, and as shown by Hoang et al., the possible exploits of the decay in confidentiality are unavoidable in any kind of online schemes, OAE2-secure ones included. Despite this controversy, OAE can afford pragmatic protection against nonce-reuse with a sufficiently large block size and an appropriate use by the higher-level protocol/security primitive/etc.

In 2016 Endignoux and Vizár [13] showed that the notion of OAE is also meaningful in the context of block-wise adaptive attacks. The authors proved that OAE is equivalent with an adapted version of the blockwise-adaptive AE notions from [12]. The equivalence result shows that OAE-secure schemes are safe to use in settings where block-wise adaptive attackers exist. The main observation in [12] was that if an application, such as a smart card, outputs a ciphertext block each time it is fed a plaintext block, then a potential attacker gets more power by allowing him to adaptively construct the queries block-by-block. Such attacks pose a genuine risk in the lightweight setting where small devices are for example not equipped with sufficient memory and work block-wise. The OAE notion has been adopted by several AEAD designs, amongst which the COLM CAESAR winner in the defense in depth category (satisfying the authenticity and limited privacy damage against nonce misuse).

Surprisingly, among the 32 s round candidates, only a handful come with a provable form of nonce misuse security. The SP1 mode in the Spook [6] submission achieves the nonce misuse resilience notion proposed by Ashur et al. [3]. Tiny JAMBU [16] provides authentication security when nonce is misused by adjusting the associated data to also play the role of nonce. When the nonce is repeated but AD is different, the security of the cipher would not be affected by the repeated nonce, yet when both nonce and AD repeat, the scheme does not offer strong security guarantees in the sense of OAE. Romulus-M [17] is an MRAE [25] mode which is secure against misuse (repetition) of nonces in encryption queries but prevents the scheme from being implemented in an online fashion.

Our Contributions. In this work we investigate the OAE security of the ForkAE [2] second round NIST candidate which is optimized to be efficient for short messages. We focus on ForkAE and more precisely on its SAEF mode for three reasons. First, SAEF comes with existing provable security guarantees in the nonce-respecting setting, hence its (in)security in presence of nonce misuse is an interesting open question. Secondly, SAEF is an online scheme and the OAE security as refined by Hoang et al. [15] is a natural target for the analysis of its nonce-reuse security. Finally, an analysis of SAEF sheds more light on the potential of its relatively novel and less researched underlying forkcipher primitive [2] Our results show that the SAEF mode of the ForkAE NIST second round submission is provably OAE secure without the need of applying any design modifications.

The corrected variant of the OAE definition by Hoang et al. [15] (labeled as OAE1 in the publication) was defined to handle plaintext whose length is a multiple of the underlying blocksize n. Hence we extended the formalism to deal with messages that are not n-bits aligned and target the resulting notion in our analysis. Additionally, we opt to use two separate requirements for confidentiality and authenticity of OAE schemes because this allows us to take slightly different approach for analysis for each property in favor of brevity and simplicity of the proofs. We use Coefficient H technique [22] as the main vehicle for the analysis and prove that SAEF is OAE-secure up to \(2^{n/2}\) blocks of processed data in total, where n is the block size of the underling forkcipher.

2 Preliminaries

Strings. All strings are binary strings. The set of all strings of length n (for a positive integer n) is denoted \( \{0,1\}^{n} \) and the set of all strings of all possible lengths is denoted \( \{0,1\}^{*} \). We let \( \{0,1\}^{\le n} = \bigcup _{i=0}^n \{0,1\}^{n} \). We denote by \( \text {Perm}(n) \) the set of all permutations of \( \{0,1\}^{n} \). We denote by \( \text {Func}({m,n}) \) the set of all functions with domain \( \{0,1\}^{m} \) and range \( \{0,1\}^{n} \), and we let \( \text {Inj}({m,n}) \subset \text {Func}({m,n}) \) denote the set of all injective functions with the same signature.

For a string X of \(\ell \) bits, we let X[i] denote the \( {i}^{\text {th}} \) bit of X for \(i=0,\ldots , \ell -1\) (starting from the left) and \(X[i\ldots j] = X[i]\Vert X[i+1]\Vert \ldots \Vert X[j]\) for \(0\le i< j < \ell \). We let \( \textsf {left}_{\ell }({X}) =X[0\ldots (\ell -1)]\) denote the \(\ell \) leftmost bits of X and \( \textsf {right}_{r}({X}) =X[(|X|-r)\ldots (|X|-1)]\) the r rightmost bits of X, such that \(X= \textsf {left}_{\chi }({X}) \Vert \textsf {right}_{|X|-\chi }({X}) \) for any \(0 \le \chi \le |X|\). Given a (possibly implicit) positive integer n and an \(X\in \{0,1\}^{*} \), we let \(X\Vert 10^*\) denote \(X\Vert 10^{n - (|X|\bmod {n}) -1}\) for simplicity. Given an implicit block length n, we let \( {\mathsf {pad10}} (X)=X\Vert 10^*\) return X if \(|X|\equiv 0\pmod {n}\) and \(X\Vert 10^*\) otherwise.

String Partitioning. For the rest of the section, we fix an arbitrary integer n and call it the block size. Given a string X, we let \(X_1,\ldots ,X_x,X_* \xleftarrow {n} X\) denote partitioning X into n-bit blocks, such that \(|X_i| = n\) for \(i=1,\ldots ,x\), \(0 \le |X_*| \le n\) and \(X=X_1\Vert \ldots \Vert X_x\Vert X_*\), so \(x=\max (0,\lceil X/n \rceil -1)\). We let \(|X|_n=\lceil X/n \rceil \). We let \((M',M_*) = \textsf {msplit}_{n}(M)\) (as in message split) denote a splitting of a string \(M\in \{0,1\}^{*} \) into two parts \(M'\Vert M_* = M\), such that \(|M_*| \equiv |M| \pmod {n}\) and \(0 \le |M_*| \le n\), where \(|M_*|=0\) if and only if \(|M|=0\). We let \((C',C_*,T) = \textsf {csplit}_{n}(C)\) (as in ciphertext split) denote splitting a string C of at least n bits into three parts \(C'\Vert C_*\Vert T = C\), such that \(|C_*| = n\), \(|T| \equiv |C| \pmod {n}\), and \(0 \le |T| \le n\), where \(|T|=0\) if and only if \(|C|=n\). Finally, we let \(C'_1,\ldots ,C'_m,C_*,T \leftarrow \textsf {csplit-b}_{n}(C)\) (as in \(\textsf {csplit}_{}\) to blocks) denote the result of \(\textsf {csplit}_{n}(C)\) followed by partitioning of \(C'\) into \(|C'|_n\) blocks of n bits, such that \(C' = C'_1\Vert \ldots \Vert C'_m\).

Blocks. We let \( \mathsf {B}_{n} = \{0,1\}^{n} \) denote the set of all n-bit strings (or else n-bit blocks), and define \( \mathsf {B}_{n} ^* = \{\varepsilon \} \cup \bigcup _{i=1}^{\infty } \mathsf {B}_{n} ^i\) with \(\varepsilon \) denoting the empty string of length 0. We call a string \(X \in \mathsf {B}_{n} ^*\)n-aligned”. For an n-aligned string X, \(X_i\) denotes its \( {i}^{\text {th}} \) n-bit block. For two distinct n-aligned strings \(X,Y \in \mathsf {B}_{n} ^*\) such that \(|X|_n \le |Y|_n\) w.l.o.g, we let \( \mathsf {llcp}_{n}({X,Y}) = \max \{ 1\le i \le |X|_n | X_j = Y_j \text { for } 1 \le j \le i \}\) denote the length of the longest common prefix of X and Y in n-bit blocks.

Miscellaneous. The symbol \(\bot \) denotes an error signal, or an undefined value. We denote by sampling an element X from a finite set \(\mathcal {X}\) following the uniform distribution. We let \( ({n})_{q} \) denote the falling factorial \(n\cdot (n-1)\cdot (n-2)\cdot \ldots \cdot (n-q+1)\) with \( ({n})_{0} = 1\). For a predicate \(\textsf {P}(x)\), we equate \(\textsf {P}(x) = \text {true}\) with \(\textsf {P}(x) = 1\) and \(\textsf {P}(x) = \text {false}\) with \(\textsf {P}(x) = 0\). We use lexicographic comparison of tuples of integers; i.e. \((i',j') < (i,j)\) iff \(i'<i\) or \(i'=i\) and \(j'< j\).

2.1 Coefficient H Technique

The coefficient H technique is a simple but powerful proof technique provided by Patarin in [22]. The coefficient H technique is used to prove indistinguishability of a construction from an idealized object in face of an information-theoretic adversary using the concept of “transcripts”. A transcript here is defined as complete record of the interaction of an adversary \(\mathcal {A}\) with its oracles in the indistinguishability experiment. To exemplify, if \((M_i,C_i)\) denotes the input and output of the i-th query of \(\mathcal {A}\) to its oracle over q queries then the corresponding transcript (denote it \(\tau \)) is \(\tau =\langle (M_1,C_1), \ldots ,(M_q,C_q)\rangle \). The task given to \(\mathcal {A}\) is to distinguish the real world \(\mathcal {O}_{\mathsf {real}}\) from the ideal world \(\mathcal {O}_{\mathsf {ideal}}\). Let us denote the distribution of transcripts in the real and the ideal world by \(\varTheta _{\mathsf {real}}\) and \(\varTheta _{\mathsf {ideal}}\), respectively. We say a transcript \(\tau \) is attainable if the probability of achieving \(\tau \) in the ideal world is non-zero. We also assume w.l.o.g. that \(\mathcal {A}\) does not make duplicate or prohibited queries. The fundamental Lemma of coefficient H technique can now be stated.

Lemma 1

(Fundamental Lemma of the Coefficient H Technique [22]). Consider that the set of attainable transcripts is partitioned into two disjoint sets \(\mathcal {T}_{\textsf {good}}\) and \(\mathcal {T}_{\textsf {bad}}\). Also, assume there exist \(\epsilon _1,\epsilon _2 \ge 0\) such that for any transcript \(\tau \in \mathcal {T}_{\textsf {good}}\), we have \( \frac{\Pr [\varTheta _{\mathsf {real}}=\tau ]}{\Pr [\varTheta _{\mathsf {ideal}}=\tau ]} \ge 1- \epsilon _1,\text { and } \Pr [\varTheta _{\mathsf {ideal}} \in \mathcal {T}_{\textsf {bad}}] \le \epsilon _2. \) Then, for all adversaries \(\mathcal {A}\), it holds that \( |\Pr [\mathcal {A}^{\mathcal {O}_{\mathsf {real}}}]-\Pr [\mathcal {A}^{\mathcal {O}_{\mathsf {ideal}}}]| \le \epsilon _1+\epsilon _2\).

2.2 Syntax of AEAD

Our targeted mode SAEF follows the AEAD syntax by Rogaway [24]. A nonce-based AEAD scheme is a triplet \(\varPi = (\mathcal {K},\mathcal {E},\mathcal {D})\). The key space \(\mathcal {K}\) is a finite set. The deterministic encryption algorithm \(\mathcal {E}:\mathcal {K}\times \mathcal {N}\times \mathcal {A}\times \mathcal {M}\rightarrow \mathcal {C}\) maps a secret key K, a nonce N, an associated data A and a message M to a ciphertext \(C=\mathcal {E}(K,N,A,M)\). The nonce, AD and message domains are all subsets of \( \{0,1\}^{*} \). The deterministic decryption algorithm \(\mathcal {D}:\mathcal {K}\times \mathcal {N}\times \mathcal {A}\times \mathcal {C}\rightarrow \mathcal {M}\cup \{\bot \}\) takes a tuple (KNAC) and either returns a message \(M\in \mathcal {M}\), or a distinguished symbol \(\bot \) to indicate an authentication error.

We require that for every \(M\in \mathcal {M}\), we have \( \{0,1\}^{|M|} \subseteq \mathcal {M}\) (i.e. for any integer m, either all or no strings of length m belong to \(\mathcal {M}\)) and that for all \(K,N,A,M \in \mathcal {K}\times \mathcal {N}\times \mathcal {A}\times \mathcal {M}\) we have \(|\mathcal {E}(K,N,A,M)| = |M| + \theta \) for some non-negative integer \(\theta \) called the stretch of \(\varPi \). For correctness of \(\varPi \), we require that for all \(K,N,A,M \in \mathcal {K}\times \mathcal {N}\times \mathcal {A}\times \mathcal {M}\) we have \(M=\mathcal {D}(K,N,A,\mathcal {E}(K,N,A,M))\). We let \(\mathcal {E}_K(N,A,M) = \mathcal {E}(K,N,A,M) \) and \(\mathcal {D}_K(N,A,C) = \mathcal {D}(K,N,A,C)\).

2.3 OAE Security

Our targeted notion of security is the definition of online AE (OAE) by Fleischmann et al. [11]. We use the variant of the notion described by Hoang et al. [15], extended as necessary to deal with messages that are not n-aligned. We opt for the two-requirement flavor of the notion, separating confidentiality and authenticity.

OAE Confidentiality. An online permutation [5] is a function \(\pi : \mathsf {B}_{n} ^* \rightarrow \mathsf {B}_{n} ^*\) such that (i) \(\pi \) is a length preserving permutation; i.e., for any integer \(m \ge 0\), the function \(\pi \) restricted to mn-bit inputs \(\pi (K,T,\cdot ): \mathsf {B}_{n} ^m \rightarrow \mathsf {B}_{n} ^m\) is a permutation (with a slight notation abuse); (ii) \(\pi \) preserves the length of blockwise prefix; i.e., for each \(M, M' \in \mathsf {B}_{n} ^*\), we have that \( \mathsf {llcp}_{n}({M,M'}) = \mathsf {llcp}_{n}({\pi (M), \pi (M')}) \).

We denote the set of all such permutations \( \text {OPerm}(n) \). We can define the distribution of a “random” online permutationFootnote 1 that is useful for experiments with a finite number of queries. We observe that each \(\pi \in \text {OPerm}(n) \) can be equivalently defined as a collection \((\pi _M)_{M \in \mathsf {B}_{n} ^*}\), such that for any \(M_1\Vert M_2\Vert \ldots \Vert M_r \in \mathsf {B}_{n} ^r\) we define \(\pi (M_1\Vert M_2\Vert \ldots \Vert M_r)\) as \(\pi _{\varepsilon }(M_1)\Vert \pi _{M_1}(M_2)\Vert \ldots \Vert \pi _{M_1\Vert \ldots \Vert M_{r-1}}(M_r)\), and that there is in fact a bijection between \( \text {OPerm}(n) \) and the set of all such permutation collections [5]. The expression should thus be understood as lazy sampling of those permutations in the collection \((\pi _M)_{M \in \mathsf {B}_{n} ^*}\) that are necessary to process adversarial queries.

We define OAE confidentiality of an AEAD scheme \( \varPi \) with two games, \( \mathbf {oprpf} \mathbf -real _ \varPi \) and \( \mathbf {oprpf} \mathbf -ideal _ \varPi \). In both games \( {\mathcal {A}} \) can make arbitrary chosen plaintext queries to a black box encryption oracle, in particular \( {\mathcal {A}} \) is allowed to repeat the nonces. In the game \( \mathbf {oprpf} \mathbf -real _ \varPi \), the encryption oracle faithfully implements the encryption algorithm of \( \varPi \) using a randomly sampled secret key. In the game \( \mathbf {oprpf} \mathbf -ideal _ \varPi \), upon an encryption query NAM the oracle returns \(\pi _{N,A}(M_L)\Vert f_{N,A,M}\) where \(M_L \leftarrow \textsf {left}_{\lfloor M/n \rfloor }({M}) \) is the longest n-aligned prefix of M, is a random online permutation for each \(N,A \in \mathcal {N}\times \mathcal {A}\), and is a random string for each \(N,A,M \in \mathcal {N}\times \mathcal {A}\times \mathcal {M}\) (with a length corresponding to the n-bit authentication tag and no more than \(n-1\) bit suffix of M remaining after the maximal n-aligned prefix \(M_L\)). The \( \mathbf {oprpf} \) advantage of an adversary \( {\mathcal {A}} \) against \(\varPi = \text {SAEF}[ {\mathsf {F}} , \nu ]\) can now be defined as \( {\mathbf {Adv}}_{\varPi }^{ \mathbf {oprpf} }( {\mathcal {A}} ) = \Pr [ {\mathcal {A}} ^{ \mathbf {oprpf} \mathbf -real _\varPi }] - \Pr [ {\mathcal {A}} ^{ \mathbf {oprpf} \mathbf -ideal _\varPi }] \ . \)

OAE Authenticity. The OAE authenticity notion coincides with the definition of ciphertext integrity where the adversary is allowed to repeat nonces, as first defined by Rogaway and Shrimpton [25]. A nonce-misusing chosen ciphertext attack of an adversary \( {\mathcal {A}} \) against the integrity of a nonce-based AE scheme \(\varPi \) is modeled through the security game \( \mathbf {auth} \). The adversary is given access to a pair of blackbox oracles. It can make arbitrary chosen plaintext queries to the encryption oracle, including queries with repeated nonces. \( {\mathcal {A}} \) can further make arbitrary chosen ciphertext queries to the decryption oracle with the goal of finding a forgery: a tuple that decrypts correctly but is not trivially known to be correct from the encryption queries. We define the advantage of \( {\mathcal {A}} \) in breaking the authenticity of \(\varPi \) as \( {\mathbf {Adv}}_{\varPi }^{ \mathbf {auth} }( {\mathcal {A}} ) = \Pr [ {\mathcal {A}} ^{ \mathbf {auth} _\varPi } \text {forges}] \) where “\( {\mathcal {A}} \) forges” denotes a decryption query that returns a value \(\ne \bot \). We assume w.l.o.g. that \( {\mathcal {A}} \) does not make duplicate queries.

2.4 Forkcipher

We follow the formalism by Andreeva et al. [23]. Informally, a forkcipher \( {\mathsf {F}} \) is a tweakable symmetric-key primitive which maps a secret key K, a tweak T and an input block M of n bits to two ciphertext blocks \(C_0\) and \(C_1\), such that \(C_0\) and \(C_1\) are each an (independent) permutation of M.

Syntax. Formally, a forkcipher is a pair of deterministic algorithms, the encryption algorithm: \( {\mathsf {F}} : \{0,1\}^{k} \times \mathcal {T}\times \{0,1\}^{n} \times \{ {{0}} , {{1}} , {\mathsf {b}} \} \rightarrow \{0,1\}^{n} \cup \{0,1\}^{n} \times \{0,1\}^{n} \) and the inversion algorithm: \( {\mathsf {F}^{-1}} \{0,1\}^{k} \times \mathcal {T}\times \{0,1\}^{n} \times \{ {{0}} , {{1}} \} \times \{ {\mathsf {i}} , {\mathsf {o}} , {\mathsf {b}} \} \rightarrow \{0,1\}^{n} \cup \{0,1\}^{n} \times \{0,1\}^{n} . \) The encryption algorithm takes a key K, a tweak \(\textsf {T}\in \mathcal {T}\), a plaintext block M and an output selector s, and outputs the (left) n-bit ciphertext block \(C_0\) if \(s= {{0}} \), the (right) n-bit ciphertext block \(C_1\) if \(s= {{1}} \), and both the blocks \(C_0,C_1\) if \(s= {\mathsf {b}} \). We write \( {\mathsf {F}} (K,\textsf {T},M,s) = {\mathsf {F}} _K(\textsf {T},M,s) = {\mathsf {F}} _K^{\textsf {T}}(M,s)= {\mathsf {F}} _K^{\textsf {T},s}(M)\) interchangeably. The inversion algorithm takes a key K, a tweak \(\textsf {T}\), a ciphertext block C (left/right half of output block), an indicator b of whether this should be treated as the left or the right ciphertext block and an output selector s, and outputs the plaintext block M if \(s= {\mathsf {i}} \), the other ciphertext block \(C'\) if \(s= {\mathsf {o}} \), and both blocks \(M,C'\) if \(s= {\mathsf {b}} \). We write \( {\mathsf {F}^{-1}} (K,\textsf {T},M,b,s) = {\mathsf {F}^{-1}} _K(\textsf {T},M,b,s) = {\mathsf {F}^{-1}} _K^{\textsf {T}}(M,b,s)= {\mathsf {F}^{-1}} _K^{\textsf {T},b,s}(M)\) interchangeably. We call kn and \(\mathcal {T}\) the keysize, blocksize and tweak space of \( {\mathsf {F}} \), respectively.

A tweakable forkcipher \( {\mathsf {F}} {}\) is correct if for each pair of key and tweak, the forkcipher applies two independent permutations to the input to produce the two output blocks. Formally, for each \(K \in \{0,1\}^{k} , \textsf {T} \in \mathcal {T}, M\in \{0,1\}^{n} \) and \(\beta \in \{ {{0}} , {{1}} \}\) the following conditions must be met: (i) \( {\mathsf {F}^{-1}} (K,\textsf {T}, {\mathsf {F}} (K,\textsf {T},M,\beta ),\beta , {\mathsf {i}} ) = M \), and (ii) \( {\mathsf {F}^{-1}} (K,\textsf {T}, {\mathsf {F}} (K,\textsf {T},M,\beta ),\beta , {\mathsf {o}} ) = {\mathsf {F}} (K,\textsf {T},M,\beta \oplus 1) \), and (iii) \(\left( {\mathsf {F}} (K,\textsf {T},M, {{0}} ),\right. \) \(\left. {\mathsf {F}} (K,\textsf {T},M, {{1}} )\right) = {\mathsf {F}} (K,\textsf {T},M, {\mathsf {b}} ) \), and (iv) \(\left( {\mathsf {F}^{-1}} (K,\textsf {T},C,\beta , {\mathsf {i}} ), {\mathsf {F}^{-1}} (K,\textsf {T},C,\beta , {\mathsf {o}} )\right) = {\mathsf {F}^{-1}} (K,\textsf {T},C,\beta , {\mathsf {b}} ) \). In the rest of the paper, we assume that \(\mathcal {T}= \{0,1\}^{t} \) for some positive t.

Security. The security of a correct forkcipher \( {\mathsf {F}} \) is defined through indistinguishability of the games \( \mathbf {prtfp} \mathbf -real _{ {\mathsf {F}} }\) (implementing \( {\mathsf {F}} \) algorithms faithfully) and \( \mathbf {prtfp} \mathbf -ideal _{ {\mathsf {F}} }\) which replaces \( {\mathsf {F}} \) by two tweakable random permutations in a natural way, in a chosen ciphertext attack. We define the advantage of \( {\mathcal {A}} \) as \( {\mathbf {Adv}}_{ {\mathsf {F}} }^{\mathrm {prtfp}}(\mathcal {A}) = \Pr [ {\mathcal {A}} ^{ \mathbf {prtfp} \mathbf -real _{ {\mathsf {F}} }} \Rightarrow 1] - \Pr [ {\mathcal {A}} ^{ \mathbf {prtfp} \mathbf -ideal _{ {\mathsf {F}} }} \Rightarrow 1] \) with the games defined in.

3 SAEF and Its OAE Security

SAEF (short for Sequential AE from a Forkcipher) is a nonce-based AEAD scheme. It takes as a parameter a tweakable forkcipher \( {\mathsf {F}} \) (as defined in Sect. 2.4) with \(\mathcal {T}= \{0,1\}^{t} \) for a positive \(t \le n\). \(\text {SAEF}[ {\mathsf {F}} ]=(\mathcal {K},\mathcal {E},\mathcal {D})\) has a key space \(\mathcal {K}= \{0,1\}^{k} \), nonce space \(\mathcal {N}= \{0,1\}^{t-4} \), and the AD and message spaces are both \( \{0,1\}^{*} \). The ciphertext expansion of \(\text {SAEF}[ {\mathsf {F}} ]\) is n bits. The encryption and decryption algorithms are given in Fig. 2 and the encryption algorithm is depicted in Fig. 1.

Fig. 1.
figure 1

The encryption algorithm of SAEF\([ {\mathsf {F}} ]\) mode. The bit \(\textsf {noM}=1\) iff \(|M|=0\). The picture illustrates the processing of AD when length of AD is a multiple of n (top left) and when the length of AD is not a multiple of n (top right), and the processing of the message when length of the message is a multiple of n (bottom left) and when the length of message is not a multiple of n (bottom right). The white hatching denotes that an output block is not computed.

An encryption query is processed in blocks of n bits, first AD and then the message, using a single call to the forkcipher per block. The calls are tweaked by composing: (1) the nonce followed by a 1-bit in the initial \( {\mathsf {F}} \) call of the query, and the string \(0^{t-3}\) otherwise, (2) three-bit flag f. The flag f is used to ensure proper domain separation for different “types” of blocks in the encryption algorithm. The values \(f \in \{000, 010, 011, 110, 111, 001, 100, 101\}\) are used, respectively, when processing non-final AD block, the last n-bit long AD block, the last AD block of \(<n\) bits, the last AD block of n bits to produce tag, the last AD block of \(<n\) bits to produce tag, non-final message block, the last n-bit message block and the last message block of \(<n\) bits.

One output block of every \( {\mathsf {F}} \) call is used as a chaining value, masking either the input (in AD processing) or both the input and the output (in message processing) of the following \( {\mathsf {F}} \) call. The very first \( {\mathsf {F}} \) call in each query is unmasked (but has the nonce in the tweak). The tag is the, possibly truncated, last “right” output of \( {\mathsf {F}} \) produced in the query (in case of truncation message padding is used for partial integrity check). In a decryption query, the processing is similar to the encryption, with the plaintext blocks and the chaining values in the message processing part being computed with the inverse \( {\mathsf {F}} {}\) algorithm.

3.1 Security of SAEF

Andreeva et al. proved that if nonce-uniqueness is guaranteed, SAEF achieves the standard AEAD confidentiality and integrity up to the birthday bound. There have been no investigations into the security of SAEF under nonce misuse (i.e., if nonces accidentally repeat). We state the formal claim about confidentiality and integrity of SAEF under nonce-misuse in Theorem 1.

Theorem 1

Let \( {\mathsf {F}} \) be a tweakable forkcipher with \(\mathcal {T}= \{0,1\}^{t} \). Then for any nonce-misuse adversary \( {\mathcal {A}} \) who makes at most \(q_e \le 2^{n-1}\) encryption queries, at most \(q_v\) decryption queries such that the total number of forkcipher calls induced by all the queries is at most \(\sigma \), we have

$$\begin{aligned} {\mathbf {Adv}}_{SAEF [ {\mathsf {F}} ]}^{ \mathbf {oprpf} }( {\mathcal {A}} ) \le&{\mathbf {Adv}}_{ {\mathsf {F}} {}}^{ \mathbf {prtfp} } ( {\mathcal {B}} ) + \frac{3\cdot \sigma ^2}{2^{n+1}}\\ {\mathbf {Adv}}_{SAEF [ {\mathsf {F}} ]}^{ \mathbf {auth} }( {\mathcal {A}} ) \le&{\mathbf {Adv}}_{ {\mathsf {F}} {}}^{ \mathbf {prtfp} } ( {\mathcal {C}} ) + \frac{\sigma ^2+4\cdot q_v}{2^n} \end{aligned}$$

for some adversaries \( {\mathcal {B}} \) and \( {\mathcal {C}} \), each making at most \(2 \sigma \) queries, and running in time given by the running time of \( {\mathcal {A}} \) plus \(\gamma \cdot \sigma \) for some “small” constant \(\gamma \).

The proof of Theorem 1 follows in Sects. 3.2 and 3.3.

Main Ingredients. The crux of the proof lies in adapting the analysis approach to the properties of SAEF arising from its sequential structure. These properties are best illustrated by an example. In two queries using the same nonce N, the same associated data \(A_1\Vert A_2\) of two blocks and two messages \(M^1_1\Vert M^1_2\Vert M^1_3\) and \(M^2_1\Vert M^2_2\Vert M^2_3\) of three blocks each and differing in the first message block, following the encryption algorithm in Fig. 2, the values of the \( {\mathsf {F}} \) tweak string and of \(\varDelta \) mask used to process each block of A would be identical between the two queries, as would be the tweak (equal to \(0^{n-1}1\)) and the \(\varDelta \) mask used to process the first message block \(M^1_1\) and \(M_1^2\) respectively. However, this equality of tweaks and \(\varDelta \) masks together with the non-equality \(M^1_1 \ne M_1^2\) imply that the \( {\mathsf {F}} \) input blocks \(\varDelta \oplus M^1_1\) and \(\varDelta \oplus M^2_1\) will necessarily differ, which will randomize the \(\varDelta \) masks used to process the next of each message.

Similarly, in another case with executing two same queries but with extending the AD used in the first query to \(A_1\Vert A_2\Vert 110^{n-2}\) and for the second query to \(A_1\Vert A_2\Vert 1\), the processing will again be identical for \(A_1\) and \(A_2\) and for the third block, the \(\varDelta \) masks and the \( {\mathsf {F}} \) inputs \(\varDelta \oplus 110^{n-2} = \varDelta \oplus {\mathsf {pad10}} (1)\) will be identical but the tweaks will differ. What emerges is that the internal variables of SAEF’s encryption algorithm preserve a certain “common prefix length” between the queries, and get randomized just past it. This observation is at the center of our proofs. Giving a formal definition of a common prefix between AE queries is also concise but is not straightforward, and requires a different representation of queries, which is the second main ingredient of our proofs.

3.2 Integrity

We switch to an alternative definition of AEAD integrity through indistinguishability, which is equivalent with the notion introduced in Sect. 3. We define two games, \( \mathbf {auth} \mathbf -real _ \varPi \) and \( \mathbf {auth} \mathbf -ideal _ \varPi \). In both games \( {\mathcal {A}} \) is given access to an encryption and a decryption oracle. In the game \( \mathbf {auth} \mathbf -real _ \varPi \), both oracles faithfully implement the respective algorithms of SAEF using the same randomly sampled secret key, except that the decryption oracle returns \(\top \) in case of a successful forgery, and \(\bot \) otherwise. In the game \( \mathbf {auth} \mathbf -ideal _ \varPi \), the encryption oracle is the same as in \( \mathbf {auth} \mathbf -real _ \varPi \) but the decryption oracle always returns \(\bot \). We claim that \( {\mathbf {Adv}}_{\varPi }^{ \mathbf {auth} }( {\mathcal {A}} ) = \Pr [ {\mathcal {A}} ^{ \mathbf {auth} \mathbf -real _\varPi }] - \Pr [ {\mathcal {A}} ^{ \mathbf {auth} \mathbf -ideal _\varPi }] \ . \)

It is easy to see that the equality holds, by establishing inequalities in both directions. An adversary \( {\mathcal {A}} \) playing the game \( \mathbf {auth} _ \varPi \) can be used to construct a distinguishing adversary \( {\mathcal {B}} \) which forwards \( {\mathcal {A}} \)’s queries and outputs 1 iff \( {\mathcal {A}} \) forges, achieving the same advantage as \( {\mathcal {A}} \). In the other direction, we can construct an adversary \( {\mathcal {A}} \) for game \( \mathbf {auth} _ \varPi \) from an indistinguishability adversary \( {\mathcal {B}} \), which forwards \( {\mathcal {B}} \)’s queries and automatically wins if \( {\mathcal {B}} \) produces a valid forgery. \( {\mathcal {A}} \) achieves the same advantage as \( {\mathcal {B}} \), because if no forgery occurs, the games \( \mathbf {auth} \mathbf -real _ \varPi \) and \( \mathbf {auth} \mathbf -ideal _ \varPi \) are indistinguishable.

Replacing \( {\mathsf {F}} \) . We first replace \( {\mathsf {F}} \) with a pair of independent random tweakable permutations and and let \(\text {SAEF}[(\pi _0,\pi _1)]\) denote the SAEF mode that uses \(\pi _0,\pi _1\) instead of \( {\mathsf {F}} \), which yields \( {\mathbf {Adv}}_{\text {SAEF}[ {\mathsf {F}} ]}^{ \mathbf {auth} }( {\mathcal {A}} ) \le {\mathbf {Adv}}_{ {\mathsf {F}} {}}^{ \mathbf {prtfp} } ( {\mathcal {B}} ) + {\mathbf {Adv}}_{ \text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {auth} }( {\mathcal {A}} ). \)

The adversary is thus left with the goal of distinguishing between the games \( \mathbf {auth} \mathbf -real _{\text {SAEF}[(\pi _0,\pi _1)]}\) (called the “real-int world”) and \( \mathbf {auth} \mathbf -ideal _{\text {SAEF}[(\pi _0,\pi _1)]}\) (called the “ideal-int world”). Hence, we want to bound \( {\mathbf {Adv}}_{\text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {auth} }( {\mathcal {A}} ) = \Pr [ {\mathcal {A}} ^{ \mathbf {auth} \mathbf -real _{\text {SAEF}[(\pi _0,\pi _1)]}}] - \Pr [ {\mathcal {A}} ^{ \mathbf {auth} \mathbf -ideal _{\text {SAEF}[(\pi _0,\pi _1)]}}] . \)

Transcripts. Following the coefficients H technique [22], we characterize the interactions of \( {\mathcal {A}} \) with its oracles in a transcript:

$$ \tau = \langle (N^i,A^i,M^i,C^i)_{i=1}^{q_e},(\bar{N}^i,\bar{A}^i,\bar{C}^i,b^i)_{i=1}^{q_v}\rangle $$

In the \( {i}^{\text {th}} \) query to the encryption oracle \((N^i,A^i,M^i)\), returning the ciphertext \(C^i\), SAEF internally processes \(A^i,M^i\) and \(C^i\) in blocks \(A^i_1, \ldots , A^i_{a^i}, A^i_*\), and \(M^i_1, \ldots , M^i_{m^i}, M^i_*\), and \(C^i_1, \ldots , C^i_{m^i}, C^i_*, T^i\) respectively (defined according to the encryption algorithm in Fig. 2). We have \(a^i\) and \(m^i\) equal to, respectively, the length of \(A^i\) and \(M^i\) in n-bit. SAEF additionally uses whitening masks, denoted here \(\varDelta ^i_1, \ldots , \varDelta ^i_{a^i+1}\) for the whitening masks used to process \(A^i_1, \ldots , A^i_{a^i}, A^i_*\) respectively, and \(\varDelta ^i_{a^i+2}, \ldots , \varDelta ^i_{a^i+m^i+2}\) for the whitening masks used to process the blocks \(M^i_1, \ldots , M^i_{m^i}, M^i_*\) respectively.

Similarly, in the \( {i}^{\text {th}} \) decryption query \((\bar{N}^i,\bar{A}^i,\bar{C}^i)\), returning \(b^i \in \{\top ,\bot \}\) SAEF internally processes \(\bar{A}^i\) and \(\bar{C}^i\) in blocks, denoted as \(\bar{A}^i_1, \ldots , \bar{A}^i_{\bar{a}^i}, \bar{A}^i_*\) and \(\bar{C}^i_1, \ldots , \bar{C}^i_{\bar{m}^i}, \bar{C}^i_*, \bar{T}^i\), where \(\bar{a}^i\) and \(\bar{m}^i\) are respectively equal to the length of \(\bar{A}^i\) and the length of \(\bar{C}^i\) in n-bit blocks (excluding the tag from the count). Additionally, SAEF internally computes the plaintext blocks \(\bar{M}^i_1, \ldots , \bar{M}^i_{\bar{m}^i}, \bar{M}^i_*\) as well as \(\bar{\varDelta }^i_1, \ldots , \bar{\varDelta }^i_{\bar{a}^i+1}\), the whitening masks used to process \(\bar{A}^i_1, \ldots , \bar{A}^i_{\bar{a}^i}, \bar{A}^i_*\) respectively, and \(\bar{\varDelta }^i_{\bar{a}^i+2}, \ldots , \bar{\varDelta }^i_{\bar{a}^i+\bar{m}^i+2}\) ,the whitening masks used to process the blocks \(\bar{C}^i_1, \ldots , \bar{C}^i_{\bar{m}^i}, \bar{C}^i_*\) respectively.

Additional Information. To simplify the proofs, we additionally provide the adversary with all the encryption masks \(\varDelta ^i_j\), all the decryption masks \(\bar{\varDelta }^i_j\) and internally computed plaintexts \(\bar{M}^i_j\) when it has made all its queries and only the final response is pending.

In the real-int world, all these variables are internally computed by the oracles faithfully evaluating SAEF. In the ideal-int world, the encryption also evaluates SAEF, so \(\varDelta ^i_j\) masks are defined, but the decryption oracle does not make any computations, hence \(\bar{\varDelta }^i_j\) and \(\bar{M}^i_j\) are not defined. We therefore have to define their sampling, to be done at the end of the experiment (thus having no influence of adversarial queries).

For \(j=1\), we have \(\bar{\varDelta }^i_1 = 0^n\) for \(1 \le i \le q_v\). We sample each of the remaining \(\bar{\varDelta }^i_j\) mask uniformly, except when its value is trivially determined due to a “common prefix” with an encryption, or decryption query (defined shortly). Once the masks are sampled, we run the SAEF decryption algorithm using \(\pi _0\) and these masks to compute \(\bar{M}^i_j\). Clearly, this give away of additional information can only increase the adversary’s advantage and thus can be considered here for upper bounding the above mentioned adversarial advantage.

Block-Tuple Representation. The \( {i}^{\text {th}} \) encryption query can be equivalently represented as \((\textsf {T}^i_j, \varDelta ^i_j, X^i_j, Y^i_j)_{j=1}^{\ell ^i}, T^i\), such that \(\ell ^i = a^i+m^i+2\) and each of the \(\ell ^i\) quadruples represents the processing done in \( {j}^{\text {th}} \) call to the forkcipher in the query, consisting of the string \(\textsf {T}^i_j\) used as forkcipher tweak, the current whitening mask \(\varDelta ^i_j\), the (possibly padded) associated data/plaintext block \(X^i_j\) and the empty/ciphertex block \(Y^i_j\). In more detail:

  • In the first block, we always have \(\textsf {T}^i_1 = N\Vert 1\Vert F\) for a flag \(F\in \{0,1\}^{3} \) and \(\varDelta ^i_1 = 0^n\). For blocks with \(j> 1\) we have \(\textsf {T}^i_j = 0^{t-3}\Vert F\) for an \(F\in \{0,1\}^{3} \).

  • If \(|A| > 0\), for \(1 \le j \le a^i\) we have \(X^i_j = A^i_j\), \(Y^i_j = \varepsilon \) and \(F=000\). For \(j=a^i+1\) we have \(X^i_j = {\mathsf {pad10}} (A^i_*)\), \(Y^i_j = \varepsilon \) and \(F\in \{0,1\}^{3} \) as defined in Fig. 2.

  • If \(|M| > 0\), for \(a^i + 2 \le j < \ell ^i\) we have \(X^i_j = M^i_j\), \(Y^i_j = C^i_j\) and \(F=001\). For \(j=\ell ^i\) we have \(X^i_j = {\mathsf {pad10}} (M^i_*)\), \(Y^i_j = C^i_*\) and \(F\in \{0,1\}^{3} \) as defined in Fig. 2.

  • If \(A=M=\varepsilon \), we have \(j=\ell ^i=1\), \(X^i_j= {\mathsf {pad10}} (\varepsilon ), Y^i_j = \varepsilon \) and \(F=111\).

We call this the block-tuple representation.

We similarly define the block-tuple representation \((\bar{\textsf {T}}^i_j, \bar{\varDelta }^i_j, \bar{X}^i_j, \bar{Y}^i_j)_{j=1}^{\bar{\ell }^i}, \bar{T}^i, b^i\) for decryption queries with \(\bar{\ell }^i = \bar{m}^i + \bar{a}^i + 2\). We further streamline the notation by re-indexing the decryption queries from \(q_e+1\) to \(q_e + q_v\), and dispensing of the bar above the variables. The decryption queries are thus denoted as \(({\textsf {T}}^i_j, \varDelta ^i_j, X^i_j, Y^i_j)_{j=1}^{\ell ^i}, T^i, b^i\) for \(q_e+1 \le i \le q_e + q_v\).

Proposition 1

The transformation of an extended transcript from the default to the block-tuple representation is injective.

The proof of Proposition 1 is given in Appendix A.

Blockwise Common Prefix of Queries. This notation allows for the following natural definition of longest common blockwise prefix between two queries. We define the longest common prefix of the \( {i}^{\text {th}} \) and \( {i'}^{\text {th}} \) query (in the block-tuple notation) with \(\ell ^i \le \ell ^{i'}\) w.l.o.g. as

$$ \mathsf {llcp}_{n}({i,i'}) = \max \{ 1\le u \le \ell ^i | (\textsf {T}^i_j, \varDelta ^i_j, X^i_j, Y^i_j) = (\textsf {T}^{i'}_j, \varDelta ^{i'}_j, X^{i'}_j, Y^{i'}_j) \text { for } 1 \le j \le u \}. $$

Note that this definition covers common blockwise prefix between two encryption queries (with \(i,i' \le q_e\)), two decryption queries (with \(i, i' > q_e)\)) and between an encryption and a decryption query (with \(1 \le i' \le q_e < i \le q_e + q_v\) w.l.o.g.). Informally, \( \mathsf {llcp}_{n}({i,i'}) \) counts how many internal chaining values (the \(\varDelta \) masks) are trivially equal between the \( {i}^{\text {th}} \) and \( {i'}^{\text {th}} \) encryption query. For example, if the nonces \(N^i\) and \(N^{i'}\) are distinct we have \( \mathsf {llcp}_{n}({i,i'}) = 0\). If we have two queries with \(N^i = N^{i'}\) and one having AD \(A^{i'} = A^i\Vert M^i_1\) equal to the AD of the other appended with its first message block, we will still have \( \mathsf {llcp}_{n}({i,i'}) = a^i + 1\), thanks to including the tweak string in the block tuples. We finally define the length of the longest common blockwise prefix of a query with all previous queries as \( \mathsf {llcp}_{n}({i}) = \max _{1<i'<i} \mathsf {llcp}_{n}({i,i'}) \). Note that for a decryption query, all the encryption queries are always taken into account (due to our convention about query indexing).

Sampling of \(\varDelta \) Masks. Using this notation, we now precisely define the sampling of \(\varDelta ^i_j\) masks in decryption queries (i.e., for \( q_e < i \le q_e+q_v\)) of the ideal-int world. In the \( {i}^{\text {th}} \) decryption query, for \(1 \le j \le \mathsf {llcp}_{n}({i}) + 1\), we let \(\varDelta ^i_j = \varDelta ^{i'}_j\) for the smallest \(i' < i\) such that \( {i'}^{\text {th}} \) query has \( \mathsf {llcp}_{n}({i}) = \mathsf {llcp}_{n}({i, i'}) \). For \( \mathsf {llcp}_{n}({i}) + 1 < j \le \ell ^i\), \(\varDelta ^i_j\) is sampled uniformly at random. In other words, \(\varDelta \) masks that are trivially known to be equal to some previous mask due to the iterative nature of SAEF are simply set to that value, and the rest is sampled uniformly.

Extended Transcripts. With the new notation in mind, the extended transcripts can be re-defined as

$$ \tau = \left\langle \left( \left( \textsf {T}^i_j, \varDelta ^i_j, X^i_j, Y^i_j\right) _{j=1}^{\ell ^i}, T^i\right) _{i=1}^{q_e},\left( \left( {\textsf {T}}^i_j, \varDelta ^i_j, X^i_j, Y^i_j\right) _{j=1}^{\ell ^i}, T^i, b^i\right) _{i=q_e + 1}^{q_e + q_v}\right\rangle \ . $$

Note that \(q_e,q_v,a\) and m here are themselves random variables and thus can vary for distinct attainable transcripts. However, due to the assumption that the adversary can make at most \(\sigma \) block queries, we always have \(\sum _{i=1}^{q_e+q_v}(a^i+m^i+2) = \sigma \).

The following corollary shows that switching to block-tuple representation does not have an impact of the result of the analysis and in particular on the bounds derived using the block-tuple representation.

Corollary 1

Due to Proposition 1, it is impossible for two distinct transcripts \( \tau = \langle (N^i,A^i,M^i,C^i)_{i=1}^{q_e},(\bar{N}^i,\bar{A}^i,\bar{C}^i,b^i)_{i=1}^{q_v}\rangle \;\) and \(\;\tau ' = \langle ({N'}^i,{A'}^i,{M'}^i,{C'}^i)_{i=1}^{q'_e},\) \(({\bar{N'}}^i,{\bar{A'}}^i,{\bar{C'}}^i,{b'}^i)_{i=1}^{q'_v}\rangle \) to have the same block-tuple representation.

Coefficient-H. Let \(\varTheta _{rein}\) and \(\varTheta _{idin}\) represent the distribution of the transcript in the real-int world and the ideal-int world, respectively.

The proof relies on the fundamental lemma of the coefficient H technique defined as Lemma 1 above. We say an attainable transcript \(\tau \) is bad if one of the following conditions occurs:

  • \(\mathsf {BadT_1}\) a.k.a. “Input Collision”: There exists \((i',j') < (i,j)\) (the block indexed by \((i',j')\) precedes (ij)) such that \(1 \le i \le q_e + q_v\), \( \mathsf {llcp}_{n}({i}) < j \le \ell ^i\) is not in the longest common prefix of the \( {i}^{\text {th}} \) query, , and the (ij) block call has tweak-input collision with the \((i',j')\) block call, i.e., \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\) and \(X^i_j \oplus \varDelta ^i_j = X^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\).

  • \(\mathsf {BadT_2}\) a.k.a. “Mask Collision”: There exists \((i',j') < (i,j)\) such that \(1 \le i \le q_e + q_v\), \( \mathsf {llcp}_{n}({i})< j < \ell ^i\) (not in the longest common prefix), and both the block calls have the same tweaks \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\) and different inputs \(X^i_j \oplus \varDelta ^i_j \ne X^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\), however, the following masks \(\varDelta ^i_{j+1} = \varDelta ^{i'}_{j' +1}\) collide. (Note that such a collision cannot occur in the real-int world where the masks are generated with permutation).

  • \(\mathsf {BadT_3}\) a.k.a. “Forgery”: There exists \(q_e+1 \le i \le q_e+q_v\) such that for \(j = \ell ^i\) we have any of the following:

    1. Case 1.

      The last bit of \(\textsf {T}^i_j\) is 0 and \(\pi _{\textsf {T}^i_j,1}(X^i_j \oplus \varDelta ^i_j)=T^i\).

    2. Case 2.

      The last bit of \(\textsf {T}^i_j\) is 1, \( \textsf {right}_{n-|T^i|}({X^i_j}) = 10^{n - |T^i| - 1}\) and \( \textsf {left}_{|T^i|}({\pi _{\textsf {T}^i_j,1}(X^i_j \oplus \varDelta ^i_j)}) =T^i.\)

We let \(\mathcal {T}_{\text {bad}}\) be the set of “bad” transcripts defined as the subset of attainable transcripts for which the transcript predicate \(\mathsf {BadT}(\tau ) = (\mathsf {BadT_1}(\tau ) \vee \mathsf {BadT_2}(\tau ) \vee \mathsf {BadT_3}(\tau ))\) evaluates true. We define \(\mathcal {T}_{\text {good}}\) as the set of attainable transcripts which are not in the set \(\mathcal {T}_{\text {bad}}\) (and therefore from now on called as good transcripts).

Lemma 2

For \(\mathcal {T}_{\text {bad}}\) above and \(q_e \le 2^{n-1}\), we have \(\Pr [\varTheta _{idin} \in \mathcal {T}_{\text {bad}}] \le \frac{\sigma ^2}{2^n}+\frac{4\cdot q_v}{2^n}~.\)

Proof

BadT\(_1\). For any transcript in \(\mathcal {T}_{\text {bad}}\) with \(\mathsf {BadT}_1\) set to 1, we know that there exists at least one pair of block indices \((i',j')< (i,j)\) such that \( \mathsf {llcp}_{n}({i}) <j\le \ell ^i\) and \(\varDelta ^i_j\oplus \varDelta ^{i'}_{j'} = X^i_j \oplus X^{i'}_{j'}.\)

Note that for all \(i' < i\) and \(j = j' = \mathsf {llcp}_{n}({i}) +1\), we have \(\varDelta ^i_j = \varDelta ^{i'}_{j'}\) but \(X^i_j \ne X^{i'}_{j'}\) and thus for such cases the probability that the above equality occurs is 0. On the other hand, for all \(i' < i\) and \(j' \ne j\) or \(j \ne \mathsf {llcp}_{n}({i}) +1\), the two masks are sampled uniformly and independently in \(\varTheta _{idin}\). Since there are total \(\sigma \) possible values of (ij) in a transcript, each having no more than \(\sigma \) possible values of \((i',j')\), we get \(\Pr [\mathsf {BadT_1}(\varTheta _{idin})= 1] \le \frac{\sigma ^2}{2}\cdot \max \left\{ 0,\frac{1}{2^n}\right\} =\frac{\sigma ^2}{2^{n+1}}~.\)

BadT\(_2\). Similarly, for any transcript in \(\mathcal {T}_{\text {bad}}\) with \(\mathsf {BadT}_2\) set to 1, we know that there exists at least one pair \((i',j') < (i,j)\) such that \( \mathsf {llcp}_{n}({i})<j<\ell ^i\) and \(\varDelta ^{i}_{j+1}\oplus \varDelta ^{i'}_{j'+1} = 0.\)

Note that from the definition of the predicate \(\mathsf {BadT}_2\) we have \(j+1 \ne \mathsf {llcp}_{n}({i}) +1\). This means that \(\varDelta ^i_{j+1}\) is distributed uniformly and independently of \(\varDelta ^{i'}_{j'+1}\). Since there are total \(\sigma \) possible values of (ij) in a transcript, each with no more than \(\sigma \) possible values of \((i',j')\), we get \(\Pr [\mathsf {BadT_2}(\varTheta _{idin}) = 1] \le \frac{\sigma ^2}{2^{n+1}}.\)

BadT\(_3\). Now, for any transcript in \(\mathcal {T}_{\text {bad}}\) with \(\mathsf {BadT}_3\) set to 1 and \(\mathsf {BadT}_1\) set to 0, we know for \(\varTheta _{idin}\) one of the following can happen:

For some \(i' \le q_e\), \(j = \ell ^i\) and \(j' = \ell ^{i'}\), we have \(j = j' = \mathsf {llcp}_{n}({i}) \). Clearly, in such a case \(\varDelta ^i_j = \varDelta ^{i'}_{j'}\), \(X^i_j = X^{i'}_{j'}\) but \(T^i \ne T^{i'}\). \(T^{i'}\) being the correct tag for the given ciphertext, \(T^i \ne T^{i'}\) cannot trigger \(\textsf {BadT}_3\), yielding 0 probability.

For some \(i' \le q_e\), \(j = \ell ^i\) and \(j' = \ell ^{i'}\), we have \(j = j' = \mathsf {llcp}_{n}({i}) +1\). We have \(\varDelta ^i_j = \varDelta ^{i'}_{j'}\) but \(X^i_j \ne X^{i'}_{j'}\) and thus the probability of any of the two conditions of \(\mathsf {BadT}_3\) occurring for a given query is at most \(4/2^n\) assuming \(q_e \le 2^{n-1}\). For the first condition, this holds as every tag there is produced with a tweak used at most once per query, corresponding to a probability \(1/(2^n-q_e) \le 2/2^n\). For the second condition, we upper bound the product of the probabilities of having the correct padding in the block \(X^i_j\) (at most \(2^{|T^i|}/(2^n-q_e)\)), and of having the correct truncated tag (at most \(2^{n-|T^i|}/(2^n-q_e)\)).

For all \(i' \le q_e\), we have \(j > \mathsf {llcp}_{n}({i,i'}) +1\). We know that the \(\varDelta ^i_j\) is not inherited from an encryption query and is therefore sampled uniformly in \(\varTheta _{idin}\). The first condition of \(\mathsf {BadT}_3\) thus occurs with a probability \(1/2^n\). For the second condition, the correct padding is found with probability \(1/2^{n-|T^i|}\) (using the randomness of \(\varDelta ^i_j\)), and the correct tag is found with probability at most \(2^{n-|T^i|}/(2^n-q_e)\), thanks to freshness of \(X^i_j \oplus \varDelta ^i_j\), relying on \(\mathsf {BadT}_1(\varTheta _{idin})=0\) w.l.o.g., yielding a probability of at most \(2/2^n\).

Since there are total \(q_v\) possible decryption queries, we get \(\Pr [\mathsf {BadT_3}(\varTheta _{idin}) = 1|\mathsf {BadT_1}(\varTheta _{idin}) = 0] \le q_v\cdot \max \left\{ 0,\frac{4}{2^n},\frac{2}{2^n}\right\} = \frac{4\cdot q_v}{2^n}~.\) Hence, we obtain by the union bound that \(\Pr [\varTheta _{idin} \in \mathcal {T}_{\text {bad}}] \le \frac{\sigma ^2}{2^n} + \frac{4\cdot q_v}{2^n}~.\)

Lemma 3

Let \(\tau \in \mathcal {T}_{\text {good}}\) i.e. \(\tau \) is a good transcript. Then \(\frac{\Pr [\varTheta _{rein}=\tau ]}{\Pr [\varTheta _{idin}=\tau ]} \ge 1.\)

Proof

Note that a good transcript has the following two properties (i) For each \((i',j') < (i,j)\) if (ij) is not in the longest common prefix of the two queries i.e. \( \mathsf {llcp}_{n}({i, i'})<j<\ell ^i\) and both \(\pi _0\) calls have same tweaks (i.e. \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\)) then both calls will always have different inputs and different outputs. (ii) For each query to the decryption oracle i.e. \(1\le i \le q_v\), the transcript contains \(b^i=\perp \) in the decryption result i.e. the conditions for a successful verification are not met.

The probability to obtain a good transcript \(\tau \) in the real-int and the ideal-int worlds can now be computed. Let \(\tau _e\) and \(\tau _d\) denote the two parts of a transcript \(\tau \) consisting respectively encryption and decryption queries, so that \(\tau = \langle \tau _e, \tau _d\rangle \). With a slight abuse of notation, we have \(\Pr [\varTheta _{rein}=\tau ] = \Pr [\varTheta _{rein, e}=\tau _e]\cdot \Pr [\varTheta _{rein, d}=\tau _d | \varTheta _{rein, e}=\tau _e]\) and \(\Pr [\varTheta _{idin}=\tau ] = \Pr [\varTheta _{idin, e}=\tau _e]\cdot \Pr [\varTheta _{idin, d}=\tau _d| \varTheta _{idin, e}=\tau _e] \) and consequently

$$\begin{aligned} \frac{\Pr [\varTheta _{rein, e}=\tau _e]\cdot \Pr [\varTheta _{rein, d}=\tau _d | \varTheta _{rein, e}=\tau _e]}{\Pr [\varTheta _{idin, e}=\tau _e]\cdot \Pr [\varTheta _{idin, d}=\tau _d| \varTheta _{idin, e}=\tau _e]} = \frac{\Pr [\varTheta _{rein, d}=\tau _d | \varTheta _{rein, e}=\tau _e]}{\Pr [\varTheta _{idin, d}=\tau _d| \varTheta _{idin, e}=\tau _e]} \ . \end{aligned}$$

This is because the encryption oracles in the real-int world and in the ideal-int world are identical, and so \(\Pr [\varTheta _{rein, e}=\tau _e] = \Pr [\varTheta _{idin, e}=\tau _e]\). Further abusing notation, we let \(\tau _{d,\varDelta }\) denote the marginal event of all \(\varDelta \) masks in the decryption queries (as variables) being equal to the values in the transcript. We have \(\Pr [\varTheta _{rein, d}=\tau _d | \varTheta _{rein, e}=\tau _e, \varTheta _{rein, d, \varDelta }=\tau _{d,\varDelta }] = \Pr [\varTheta _{idin, d}=\tau _d| \varTheta _{idin, e}=\tau _e, \varTheta _{idin, d, \varDelta }=\tau _{d,\varDelta }] \) because both sides of this equality correspond to mappings defined with random permutations with the input-output pairs fixed from the encryption part in both worlds. Further, using this equality, we get

$$\begin{aligned} \frac{\Pr [\varTheta _{rein, d}=\tau _d | \varTheta _{rein, e}=\tau _e]}{\Pr [\varTheta _{idin, d}=\tau _d| \varTheta _{idin, e}=\tau _e]} = \frac{\Pr [\varTheta _{rein, d, \varDelta }=\tau _{d,\varDelta } | \varTheta _{rein, e}=\tau _e]}{\Pr [\varTheta _{idin, d, \varDelta }=\tau _{d,\varDelta }| \varTheta _{idin, e}=\tau _e]}~. \end{aligned}$$

Let us now consider that \(\tau \) has in total \(\delta \) many \(\varDelta \)s that are fixed/predefined due to all internal common prefixes. Clearly, one can write that \(\delta = \sum ^{q_e+q_v}_{i=1}( \mathsf {llcp}_{n}({i}) +1)\) (the extra 1 here stands for the \(\varDelta ^i_1\) which is always fixed to 0). In the ideal-int world, since the \(\varDelta \)s corresponding to these \((\sigma -\delta )\) unique block calls are sampled uniformly and independently and all decryption oracle results are \(\perp \), one has \( \Pr [\varTheta _{idin, d, \varDelta }=\tau _{d,\varDelta }| \varTheta _{idin, e}=\tau _e] = \frac{1}{(2^n)^{\sigma -\delta }}~. \)

In the real-int world, these \((\sigma -\delta )\) \(\varDelta \)s are no longer uniformly distributed but are defined using the random tweakable permutation \((\pi _0,\pi _1)\) with at least \(g_1=\sum _{i=1}^{q_e+q_v}(a^i-1)\) block calls with the tweak \(0^n\) and at least \(g_2=\sum _{i=1}^{q_e+q_v}(m^i-1)\) block calls with the tweak \(0^{n-1}\Vert 1\). Thus, one has \( \Pr [\varTheta _{rein, d, \varDelta }=\tau _{d,\varDelta } | \varTheta _{rein, e}=\tau _e] \ge \frac{1}{(2^n)_{g_1}(2^n)_{g_2}(2^n)^{\sigma -\delta -g_1-g_2}}. \)

Note that the above expression is not an equality and only gives an upper bound on the targeted probability as there are more permutation calls which can have tweak collisions (To exemplify, the first block calls of any set of queries will have same tweaks if they all have same nonce). Now, from the above expressions, we get

$$ \frac{\Pr [\varTheta _{rein}=\tau ]}{\Pr [\varTheta _{idin}=\tau ]} \ge \frac{(2^n)^{\sigma -\delta }}{(2^n)_{g_1}(2^n)_{g_2}(2^n)^{\sigma -\delta -g_1-g_2}} = \frac{(2^n)^{g_1}(2^n)^{g_2}}{(2^n)_{g_1}(2^n)_{g_2}} \ge 1~. $$

Combining the results of Lemma 2 and 3 (taking \(\epsilon _1=0\)) into Lemma 1, we obtain the upper bound \({\mathbf {Adv}}_{\text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {auth} }( {\mathcal {A}} ) \le \frac{\sigma ^2}{2^n} +\frac{4\cdot q_v}{2^n}\)

and thus the integrity result of the Theorem 1.

3.3 Confidentiality

Replacing\( {\mathsf {F}} \) . We replace \( {\mathsf {F}} \) with a pair of independent random tweakable permutations and and let \(\text {SAEF}[(\pi _0,\pi _1)]\) denote the SAEF mode that uses \(\pi _0,\pi _1\) instead of \( {\mathsf {F}} \), which yields \( {\mathbf {Adv}}_{\text {SAEF}[ {\mathsf {F}} ]}^{ \mathbf {oprpf} }( {\mathcal {A}} ) \le {\mathbf {Adv}}_{ {\mathsf {F}} {}}^{ \mathbf {prtfp} } ( {\mathcal {B}} ) + {\mathbf {Adv}}_{ \text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {oprpf} }( {\mathcal {A}} ). \)

With this replaced “\(\mathsf {F}\)” scenario, the adversary is left with the goal of distinguishing between \( \mathbf {oprpf} \mathbf -real _{\text {SAEF}[(\pi _0,\pi _1)]}\) (called the “real-conf world”) and \( \mathbf {oprpf} \mathbf -ideal _{\text {SAEF}[(\pi _0,\pi _1)]}\) (called the “ideal-conf world”). We now need to bound \( {\mathbf {Adv}}_{\text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {oprpf} }( {\mathcal {A}} ) = \Pr [ {\mathcal {A}} ^{ \mathbf {oprpf} \mathbf -real _{\text {SAEF}[(\pi _0,\pi _1)]}}] - \Pr [ {\mathcal {A}} ^{ \mathbf {oprpf} \mathbf -ideal _{\text {SAEF}[(\pi _0,\pi _1)]}}] \ . \)

Transcripts and Additional Information. As before, we record the interaction of \( {\mathcal {A}} \) with its oracle in a transcript containing the encryption queries and their responses as \( \tau = \langle (N^i,A^i,M^i,C^i)_{i=1}^{q_e}\rangle \) such that the AD A, message M and ciphertext C are further partitioned in blocks as described in the integrity proof. Note that as before, \(q_e,a\) and m here are themselves random variables and thus can vary for distinct attainable transcripts. We also assume that the adversary can make at most \(\sigma ' \le \sigma \) block queries. Thus, we always have \(\sum _{i=1}^{q_e}(a^i+m^i+2) \le \sigma \).

To simplify the proofs, we again provide the adversary with additional information: (i) \(\varDelta ^i_j\) masks for \(1\le i \le q_e\) and for \(1\le j \le \ell ^i = a^i + m^i + 2\) that are internally computed by the encryption algorithm of SAEF. (ii) Tag bits that would normally be discarded by truncation (if any), i.e., we now have \(|T^i| = n\) for \(1 \le i \le q_e\).

This information is given to the adversary after it has made all its queries. These masks, as well the bits extending the tags to n bits, are well-defined in real-conf world, which faithfully implements SAEF encryption algorithm. In the ideal-conf world, however, they are not defined (as ciphertexts are directly computed by an online permutation and a random function, and the tags are directly sampled with the desired length) and thus the masks are sampled uniformly at random while maintaining the consistency with SAEF’s prefix preservation (defined in detail below), and each authentication tag is simply extended by 0 to \(n - 1\) uniform bits as necessary.

Block-Tuple Representation. As before, we use the block-tuple representation. We represent the \( {i}^{\text {th}} \) encryption query as \((\textsf {T}^i_j, \varDelta ^i_j, X^i_j, Y^i_j)_{j=1}^{\ell ^i}, T^i\), with \(\ell ^i = a^i+m^i+2\) and each of the \(\ell ^i\) quadruples consisting of the string \(\textsf {T}^i_j\) used as forkcipher tweak, the \( {j}^{\text {th}} \) whitening mask \(\varDelta ^i_j\), the (possibly padded) associated data/plaintext block \(X^i_j\) and the empty/ciphertex block \(Y^i_j\).

We reuse the same definition of the length of longest common blockwise prefix between the \( {i}^{\text {th}} \) and the \( {i'}^{\text {th}} \) query \( \mathsf {llcp}_{n}({i,i'}) \) and between the \( {i}^{\text {th}} \) query and all preceding queries \( \mathsf {llcp}_{n}({i}) \).

Sampling of \(\varDelta \) Masks. Using the block-tuple representation, we now detail the sampling of \(\varDelta \) masks in the ideal world. For each \(1 \le i \le q_e\), we let \(\varDelta ^i_j = \varDelta ^{i'}_j\) for \(1 \le j \le \mathsf {llcp}_{n}({i}) + 1\) with the smallest \(i' < i\) such that \( {i'}^{\text {th}} \) query has \( \mathsf {llcp}_{n}({i}) = \mathsf {llcp}_{n}({i, i'}) \). For \( \mathsf {llcp}_{n}({i}) + 1 < j \le \ell ^i\) we sample the mask \(\varDelta ^i_j\) uniformly at random.

Extended Transcripts. The extended transcripts available to the adversary at the decision-making time are denoted as \( \tau = \langle ((\textsf {T}^i_j, \varDelta ^i_j, X^i_j, Y^i_j)_{j=1}^{\ell ^i}, T^i)_{i=1}^{q_e} \rangle \ . \)

Coefficient H. Let \(\varTheta _{reco}\) and \(\varTheta _{idco}\) represent the distribution of the transcripts in the real-conf world and the ideal-conf world, respectively.

The proof relies on the fundamental lemma of the coefficient H technique defined as Lemma 1 above. We represent the \(j^{th}\) block call of the \(i^{th}\) query in a transcript by the index tuple (ij). We say an attainable transcript \(\tau \) is bad if one of the following conditions occur:

  • \(\mathsf {BadT'_1}\) a.k.a. “Input Collision”: There exists \((i',j') < (i,j)\) (the block indexed by \((i',j')\) precedes (ij)) such that \(1 \le i \le q_e\), \( \mathsf {llcp}_{n}({i}) < j \le \ell ^i\) is not in the longest common prefix of the \( {i}^{\text {th}} \) query, and the (ij) block call has tweak-input collision with the \((i',j')\) block call, i.e., \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\) and \(X^i_j \oplus \varDelta ^i_j = X^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\).

  • \(\mathsf {BadT'_2}\) a.k.a. “Output Collision”: There exists \((i',j') < (i,j)\) such that \(1 \le i \le q_e\), \( \mathsf {llcp}_{n}({i})< j < \ell ^i\) (not in the longest common prefix), and both the block calls have the same tweaks \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\) and different inputs \(X^i_j \oplus \varDelta ^i_j \ne X^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\), however, one of the outputs collide i.e. one of the followings are true

    1. I.

      \(j = \mathsf {llcp}_{n}({i}) +1\) and (i) \(\varDelta ^i_{j+1} = \varDelta ^{i'}_{j' +1}\) if \(j < \ell ^i\) and \(j' < \ell ^{i'}\), or (ii) \(T^i = T^{i'}\) if \(j = \ell ^i\) and \(j' = \ell ^{i'}\).

    2. II.

      \(j > \mathsf {llcp}_{n}({i}) +1\) and (i) \(Y^i_j \oplus \varDelta ^i_j = Y^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\) if \(j > a^i + 1\), or (ii) \(\varDelta ^i_{j+1} = \varDelta ^{i'}_{j' +1}\) if \(j < \ell ^i\) and \(j' < \ell ^{i'}\), or (iii) \(T^i = T^{i'}\) if \(j = \ell ^i\) and \(j' = \ell ^{i'}\).

      (Note that such a collision cannot occur in the real-conf world where the masks and the tags are generated with a permutation).

We let \(\mathcal {T}'_{\text {bad}}\) be the set of “bad” transcripts defined as the subset of attainable transcripts for which the transcript predicate \(\mathsf {BadT'}(\tau ) = (\mathsf {BadT'_1}(\tau ) \vee \mathsf {BadT'_2}(\tau ))\) evaluates true. We define \(\mathcal {T}'_{\text {good}}\) as the set of attainable transcripts which are not in the set \(\mathcal {T}'_{\text {bad}}\) (and therefore from now on called as good transcripts).

Lemma 4

For \(\mathcal {T}'_{\text {bad}}\) as defined above, we have \(\Pr [\varTheta _{idco} \in \mathcal {T}'_{\text {bad}}] \le \frac{3\cdot \sigma ^2}{2^{n+1}}~.\)

Proof

BadT\('_1\). For any transcript in \(\mathcal {T}'_{\text {bad}}\) with \(\mathsf {BadT'}_1\) set to 1, we know that there exists at least one pair of (ij) and \((i',j')\) such that \( \mathsf {llcp}_{n}({i}) <j\le \ell ^i\), \((i' , j') < (i, j)\) and \(\varDelta ^i_j\oplus \varDelta ^{i'}_{j'} = X^i_j \oplus X^{i'}_{j'}.\)

Note that for all \(i' < i\) and \(j = j' = \mathsf {llcp}_{n}({i}) +1\), we have \(\varDelta ^i_j = \varDelta ^{i'}_{j'}\) but \(X^i_j \ne X^{i'}_{j'}\) and thus for such cases the probability that the above equality occurs is 0 (one should notice that the case when we have only the nonce collision is also covered in here. This is because for all \(i'<i\) if \(N^i = N^{i'}\) then we have \(j=j'=1\) and \( \mathsf {llcp}_{n}({i}) =0\) which implies that \(\varDelta ^i_1 = \varDelta ^{i'}_1 = 0\) and \(X^i_1 \ne X^{i'}_1\)). On the other hand, for all \(i' < i\) and \(j' \ne j\) or \(j \ne \mathsf {llcp}_{n}({i}) +1\), the two masks are sampled uniformly and independently in \(\varTheta _{idco}\). Since there are total \(\sigma ' \le \sigma \) possible values of (ij) in a transcript, each having no more than \(\sigma ' \le \sigma \) possible values of \((i',j')\), we get \(\Pr [\mathsf {BadT'_1}(\varTheta _{idco})= 1] \le \frac{\sigma ^2}{2}\cdot \max \left\{ 0,\frac{1}{2^n}\right\} =\frac{\sigma ^2}{2^{n+1}}.\)

BadT\('_2\). Similarly, for any transcript in \(\mathcal {T}'_{\text {bad}}\) with \(\mathsf {BadT'}_2\) set to 1, we know that there exists at least one pair of blocks \((i',j') < (i,j)\) such that \( \mathsf {llcp}_{n}({i})<j<\ell ^i\) and one of the followings is true (with appropriate values of j)

  1. I.

    \(j = \mathsf {llcp}_{n}({i}) +1\) and (\(\varDelta ^i_{j+1} = \varDelta ^{i'}_{j' +1}\) or \(T^i = T^{i'}\)) (in this case, we can’t have \(Y^i_j+\varDelta ^i_j = Y^{i'}_{j'}+\varDelta ^{i'}_{j'}\) as by the definition of \( \mathsf {llcp}_{n}({i}) \), \(X^i_j \ne X^{i'}_{j'}\) implies that \(Y^i_j \ne Y^{i'}_{j'}\))

  2. II.

    \(j > \mathsf {llcp}_{n}({i}) +1\) and \((Y^i_j+\varDelta ^i_j = Y^{i'}_{j'}+\varDelta ^{i'}_{j'}\) or \(\varDelta ^i_{j+1} = \varDelta ^{i'}_{j' +1}\) or \(T^i = T^{i'})\).

Note that from the definition of the predicate \(\mathsf {BadT'}_2\) we have for any j, \(j+1 \ne \mathsf {llcp}_{n}({i}) +1\). This means that \(\varDelta ^i_{j+1}\) is distributed uniformly and independently of \(\varDelta ^{i'}_{j'+1}\), with a collision probability \(1/2^n\). Each tag is generated as n uniform bits, independent of all other tags, making them collide with probability \(1/2^n\). On the other hand, for each \(j > \mathsf {llcp}_{n}({i}) \), \(\varDelta ^i_{j}\) is distributed uniformly and independently of \(\varDelta ^{i'}_{j'}\), so the masked ciphertexts also collide with probability \(1/2^n\).

There are no more than \(\sigma \) possible values of (ij) in a transcript to cause a collision of masked ciphertext, each with no more than \(\sigma \) possible values of \((i',j')\), making for no more than \(\sigma ^2/2\) pairs. In addition, there are no more than \(\sigma - q_e\) valid values of (ij) for a \(\varDelta \) collision, each with no more that \(\sigma - q_e\) possible values of \((i',j')\), yielding no more than \((\sigma - q_e)^2/2\) pairs. Finally, there are \(q_e\) tags that can cause a collision with one another, yielding no more than \(q_e^2/2\) pairs. Thus we get \(\Pr [\mathsf {BadT'_2}(\varTheta _{idco}) = 1] \le \frac{2\cdot \sigma ^2}{2^{n+1}}~.\)

Hence, we obtain by the union bound that \(\Pr [\varTheta _{idco} \in \mathcal {T}'_{\text {bad}}] \le \frac{3\cdot \sigma ^2}{2^{n+1}}~.\)

Lemma 5

Let \(\tau \in \mathcal {T}'_{\text {good}}\) i.e. \(\tau \) is a good transcript. Then \(\frac{\Pr [\varTheta _{reco}=\tau ]}{\Pr [\varTheta _{idco}=\tau ]} \ge 1~.\)

Proof

Note that a good transcript has the following property. For each \((i',j') < (i,j)\) if (ij) is not in the longest common prefix of the two queries i.e. \( \mathsf {llcp}_{n}({i, i'})<j<\ell ^i\) and both \(\pi _0\) (resp. \(\pi _1\)) calls have same tweaks (i.e. \(\textsf {T}^i_j = \textsf {T}^{i'}_{j'}\)) then both blocks will always have different inputs (i.e., \(X^i_j \oplus \varDelta ^i_j \ne X^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\)) and different outputs (i.e., \(Y^i_j \oplus \varDelta ^i_j \ne Y^{i'}_{j'} \oplus \varDelta ^{i'}_{j'}\) and \(\varDelta ^i_{j+1} \ne \varDelta ^{i'}_{j'+1}\) respectively \(T^i \ne T^{i'}\)).

The probability to obtain a good transcript \(\tau \) in the real-conf and the ideal-conf worlds can now be computed. With a slight abuse of notation, we let \(\tau _{\varDelta }\) denote the marginal event of all \(\varDelta \) masks in the queries (as variables) being equal to the values in the transcript. With these notations, we have \(\Pr [\varTheta _{reco}=\tau | \varTheta _{reco,\varDelta }=\tau _{\varDelta }] \ge \Pr [\varTheta _{idco}=\tau | \varTheta _{idco,\varDelta }=\tau _{\varDelta }]~. \) This is true because for fixed and unique (upto common prefix) input-output pairs (excluding the tags), the left side of this inequality corresponds to mappings of a random permutation with input size of n bits whereas the right side of this inequality corresponds to mappings of a random online permutation with input size of \(\ge n\)-bits. Similarly for the tags (fixed and unique), the left side of this inequality corresponds to a random permutation whereas the right side of this inequality corresponds to a random function both with same input size (n-bits).

Let us now consider that \(\tau \) has in total \(\delta '\) \(\varDelta \)s that are fixed/predefined due to all internal common prefixes. Clearly, one can write that \(\delta ' = \sum ^{q_e}_{i=1}( \mathsf {llcp}_{n}({i}) +1)\) (the extra 1 here stands for the \(\varDelta ^i_1\) which is always fixed to 0). In the ideal-conf world, since the \(\varDelta \)s corresponding to these \(\sigma '-\delta '\) unique block calls are sampled uniformly and independently, one has \( \Pr [\varTheta _{idco,\varDelta }=\tau _{\varDelta }] = \frac{1}{(2^n)^{\sigma '-\delta '}} \)

(Note that the sampling of the tags is already covered with the inequality defined above and is independent of the sampling of \(\varDelta \)s here).

In the real-conf world, these \(\sigma '-\delta '\) \(\varDelta \)s are no longer uniformly distributed but are defined using the random tweakable permutation \((\pi _0,\pi _1)\) with at least \(g'_1=\sum _{i=1}^{q_e}(a^i-1)\) block calls with the tweak \(0^n\) and at least \(g'_2=\sum _{i=1}^{q_e}(m^i-1)\) block calls with the tweak \(0^{n-1}\Vert 1\). Thus, one has

$$ \Pr [\varTheta _{reco,\varDelta }=\tau _{\varDelta }] \ge \frac{1}{(2^n)_{g'_1}(2^n)_{g'_2}(2^n)^{\sigma '-\delta '-g'_1-g'_2}}~. $$

Now, from the above three expressions, we get

$$ \frac{\Pr [\varTheta _{reco}=\tau ]}{\Pr [\varTheta _{idco}=\tau ]} \ge \frac{(2^n)^{\sigma '-\delta '}}{(2^n)_{g'_1}(2^n)_{g'_2}(2^n)^{\sigma '-\delta '-g'_1-g'_2}} = \frac{(2^n)^{g'_1}(2^n)^{g'_2}}{(2^n)_{g'_1}(2^n)_{g'_2}} \ge 1~. $$

Combining the results of Lemma 4 and 5 (taking \(\epsilon _1=0\)) into Lemma 1, we obtain the upper bound \({\mathbf {Adv}}_{\text {SAEF}[(\pi _0,\pi _1)]}^{ \mathbf {oprpf} }( {\mathcal {A}} ) \le \frac{3\cdot \sigma ^2}{2^{n+1}}\)

and thus the confidentiality result of the Theorem 1.

4 Conclusion

We prove that SAEF is OAE-secure w.r.t. both confidentiality and integrity as long as the total amount of data processed with a single key is \(\ll 2^{n/2}\) blocks, with n the blocksize of the underling forkcipher. This means that SAEF offers qualitatively stronger guarantees than what has been advertised in the original submission at the same quantitative security levels. Moreover, the newly discovered security properties of SAEF are highly relevant to many resource-constrained applications of lightweight cryptography, as discussed in Sect. 1. At the same time, SAEF can be implemented very efficiently with an efficient forkcipher instance such as ForkSKINNY [23], likely outperforming COLM (the de facto state of the art OAE-secure scheme) instantiated with SKINNY [4] tweakable blockcipher (which gives the most accurate comparison due to similarities with ForkSKINNY).Footnote 2 This makes SAEF a much more attractive implementation target than indicated by the previous results, especially for the most constrained devices.

An interesting avenue for future work is investigating the security of SAEF under the release of unverified plaintext (RUP). RUP security is also a natural target in the uses cases for lightweight cryptography, where the same constraints as discussed in Sect. 1 imply that constrained devices may be forced to leak portions of yet-unverified plaintext when decrypting long messages, complementing the blockcwise security of encryption implied by OAE. Our conjecture is that SAEF is INT-RUP secure without any modification.