Variants of Wegman-Carter message authentication code supporting variable tag lengths

Ghosh, Sebati; Sarkar, Palash

doi:10.1007/s10623-020-00840-w

Variants of Wegman-Carter message authentication code supporting variable tag lengths

Published: 04 February 2021

Volume 89, pages 709–736, (2021)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Designs, Codes and Cryptography Aims and scope Submit manuscript

Variants of Wegman-Carter message authentication code supporting variable tag lengths

Download PDF

351 Accesses
2 Citations
Explore all metrics

Abstract

In this work, we study message authentication code (MAC) schemes supporting variable tag lengths. We provide a formalisation of such a scheme. Several variants of the classical Wegman-Carter MAC scheme are considered. Most of these are shown to be insecure by pointing out detailed attacks. One of these schemes is highlighted and proved to be secure. We further build on this scheme to obtain single-key variable tag length MAC schemes utilising either a stream cipher or a short-output pseudo-random function. These schemes can be efficiently instantiated using practical well known primitives.

Secure Message Authentication Against Related-Key Attack

Non-adaptive Group-Testing Aggregate MAC Scheme

Full PRF-Secure Message Authentication Code Based on Tweakable Block Cipher

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Message authentication code (MAC) is the cryptographic mechanism to ensure the authenticity of messages transmitted across a public channel. A MAC scheme typically appends a short length tag to the message which is then transmitted. At the receiving end, a verification algorithm is run on the message-tag pair to confirm the authenticity. In such a set-up, the sender and the receiver share a previously agreed upon secret key.

Most MAC schemes specify a single value for the tag length. The question that we address in this work is the following. Is it possible to have MAC schemes where the tag length can vary? While the question seems to be a natural one, there does not appear to have been much discussion about this issue in the literature. The only material we could locate is an almost 15-year old CFRG [26] discussion pertaining to different tag lengths suggested for the MAC scheme UMAC [15]. This scheme had the possibility of using 32-bit, 64-bit, 96-bit and 128-bit tags. Finney [12], crediting “Dan Bernstein’s poly1305-aes mailing list”, had pointed out that this feature would allow forging a 64-bit tag using about $2^{33}$ queries. A later post [13] explains the issue further and suggests how a valid 128-bit tag can be obtained with only about $2^{34}$ queries. Wagner [27] supporting the issue raised by Finney, had mentioned that to fix the problem “it suffices to ensure that the tag length is a parameter that is immutably bound to the key and never changed. In other words, never use the same key with different parameter sizes.” Following this suggestion, Section 6.5 of the UMAC specification [15] states that a “UMAC key (or session) must have an associated and immutable tag length”. Another suggestion put forward by Finney [13] to handle the issue requires “stealing two bits of input into the block cipher from the nonce and using them to encode tag size”. Apart from the interesting discussion on variable tag lengths for the UMAC scheme, we know of no other place where the issue of variable tag length MAC schemes has been considered.

The question of variable tag length received some attention in the past few years in the context of authenticated encryption (AE) schemes and the CAESAR [9] competition. Manger [18] pointed out that for the AE scheme OCB, 64-bit, 96-bit and 128-bit tags are defined where the “64-bit and 96-bit tags are simply truncated 128-bit tags”. This leads to simple truncation attacks on the scheme. An earlier paper by Rogaway and Wagner [23] had also discussed the problem of variable tag lengths in the context of the AE scheme CCM. A formal treatment of variable tag length AE schemes has been given by Reyhanitabar, Vaudenay and Vizár [22].

Two concrete motivations are provided in [22] as to why a variable tag length AE scheme may indeed be desirable in practice. The first mentions that variable tag lengths may be used with the same key due to “misuse and poorly engineered security systems”. The second reason is that for resource constrained devices, variable tag lengths may be desirable though changing the key for every tag length may be infeasible due to limited bandwidth and low power.

While the above two reasons have been put forward in the context of AE schemes, they are equally valid for MAC schemes. More generally, the issue of “mis-implementation” (also called “footguns”) [21] of cryptographic primitives has been extensively discussed as part of the discussion forum on post-quantum cryptography.

More concretely, Auth256 [7] is a Wegman-Carter type construction targeted at the 256-bit security level. Similarly, a 256-bit secure universal hash function has been proposed in [10], which can be mated to a 256-bit secure PRF using the Wegman-Carter template to obtain a 256-bit secure MAC. Such MAC schemes would be appropriate for high-security applications, or, for a post-quantum world. On the other hand, bandwidth limited applications would require shorter tags. Also, the possibility of mis-implementation using tag truncation remains. So, the question of designing a MAC scheme which can support various tag lengths up to 256 bits is of practical interest.

To summarise, the problem of variable tag length MAC schemes has been briefly mentioned about 15 years ago. Since then, there has neither been any formal treatment of the topic and nor has there been any variable tag length MAC scheme which is accompanied by a proof of security. The problem of constructing such MAC schemes, though, is of contemporary and future practical interest.

1.1 Our contributions

We provide a formalisation of the notion of security for a variable tag length MAC scheme. For the same key, the desired tag length is to be provided as part of the input to the tag generation algorithm. Consequently, in the security model, we allow the adversary to control the tag length as well as the message. This is an extension of the usual security model for MAC schemes.

We consider the problem of obtaining secure variable tag length MAC schemes. The Wegman-Carter [29] scheme is the classical nonce-based MAC scheme. A naive approach to obtain a variable tag length MAC scheme is to truncate tags produced by the Wegman-Carter scheme. We show an easy attack on such a truncation scheme. Next, we consider eight possible “natural” variants that arise from the Wegman-Carter MAC scheme. We show attacks on six of these schemes. These attacks do not repeat nonces for tag generation queries. Among the attacked schemes is the scheme obtained by nonce stealing following the suggestion of Finney [12] as mentioned above. One of the eight schemes is generically secure since it uses independent keys for different tag lengths. The last of the eight schemes is proved to be secure. This scheme uses nonce stealing but, for different tag lengths, it uses independent keys for the universal hash function component of the Wegman-Carter scheme.

From a practical point of view, it is desirable to have a scheme which uses a single key. The key for the hash function is then derived from the key of the scheme and the tag length. The manner in which such derivation is made depends upon the primitive used to derive the hash key. We show two methods of deriving the hash key. The first method uses a stream cipher while the second method uses a short output length pseudo-random function (PRF). So, in effect, we obtain two constructions of single key variable tag length MAC scheme.

All the schemes that we describe can be instantiated by readily available concrete cryptographic primitives. For example, either of the 256-bit secure universal hash functions in [7, 10] can be combined with Salsa20 [3] to obtain nonce-based MAC schemes supporting variable tag lengths up to 256 bits. So, our work provides templates for designing efficient and practical MAC schemes which support variable tag lengths.

1.2 Previous and related works

The notion of MAC is several decades old. So, there is an extensive literature on this topic. Here we mention the papers which are directly related to our work.

The Wegman-Carter [29] scheme is four decades old. Several important and practical MAC schemes, such as UMAC [8] and Poly1305 [4] are based on the Wegman-Carter scheme. From a theoretical point of view, the security of the Wegman-Carter scheme was later analysed by Shoup [25] and Bernstein [5]. Recently, the optimality of Bernstein’s bound was established in [17, 20].

The point that tag lengths can vary depending on the application has been noted in [24] where the problem of determining an economically optimal tag length has been considered from a game theoretic point of view. This is completely different from the work reported in the present paper.

Relation to the work of Reyhanitabar et al. [22]: The notion of authenticated encryption with associated data (AEAD) which can support variable tag lengths was introduced in [22]. An AEAD scheme has two algorithms, namely encryption and decryption. The encryption algorithm takes as input a nonce, a plaintext, an associated data and a tag length and returns the corresponding ciphertext; while the decryption algorithm takes as input a nonce, a ciphertext, an associated data and a tag length and either returns $\bot $ indicating that the input is improper, or, returns the corresponding plaintext. Such an AEAD scheme can be considered to be a nonce-based MAC scheme where the plaintext is always fixed to the empty string and the message to be authenticated is provided as the associated data. With this modification, the formalisation of the authenticity of the AEAD scheme in [22] turns out to be the same as the formalisation of the variable tag length nonce-based MAC scheme introduced in this work. The difference between our formalisation and that of [22] is in the treatment of adversarial resources. We have considered the notion of query profile, while the usual notion of query complexity has been considered in [22]. In terms of construction, the contribution of [22] is different from ours. A variant of OCB [16] is considered in [22], while we describe variants of the Wegman-Carter scheme.

2 Definitions

Let x be a binary string: ${\textsf {len}} (x)$ denotes the length of x; for a non-negative integer $\lambda $, ${\textsf {msb}} _{\lambda }(x)$ denotes the $\lambda $ most significant bits of x. Given an integer i in the range $0\le i<2^k-1$, ${\textsf {bin}} _k(i)$ denotes the k-bit binary representation of i.

Throughout this paper, n is a fixed positive integer.

2.1 Hash function

Let ${\mathcal {M}}$ and $\Theta $ be finite non-empty sets. Let $\{H_{\tau }\}_{\tau \in \Theta }$ be an indexed family of functions such that for each $\tau \in \Theta $, $H_{\tau }:{\mathcal {M}}\rightarrow \{0,1\}^n$. The sets ${\mathcal {M}}$ and $\Theta $ are respectively the message and the key spaces. Typically, a message is a binary string of some maximum length.

For distinct $x,x^{\prime }\in {\mathcal {M}}$ and any n-bit string y, the differential probability of $H_{\tau }$ for the triplet $(x,x^{\prime },y)$ is defined to be $\Pr _{\tau }[H_{\tau }(x)\oplus H_{\tau }(x^{\prime })=y]$, where the probability is taken over the uniform random choice of $\tau $ from $\Theta $. The differential probability may depend on the lengths of x and $x^{\prime }$. Suppose L is the maximum of the lengths of the binary strings in ${\mathcal {M}}$. Let $\varepsilon :\{0,\ldots ,L\}^2\rightarrow [0,1]$ be a function such that the differential probability for any $(x,x^{\prime },y)$ is at most $\varepsilon ({\textsf {len}} (x),{\textsf {len}} (x^{\prime }))$. Then the family $\{H_{\tau }\}_{\tau \in \Theta }$ is said to be $\varepsilon $-AXU.

2.2 Pseudo-random function

Let ${\mathcal {D}}$ and ${\mathcal {R}}$ be finite non-empty sets of binary strings and ${\mathcal {K}}$ be a finite non-empty set. Let $\{F_K\}_{K\in {\mathcal {K}}}$ be a keyed family of functions with $F_K:{\mathcal {D}}\rightarrow {\mathcal {R}}$. Informally speaking, the function family $\{F_K\}_{K\in {\mathcal {K}}}$ is considered to be pseudo-random if a resource limited adversary is unable to distinguish it from a uniform random function from ${\mathcal {D}}$ to ${\mathcal {R}}$. This is formalised in the following manner.

We consider an adversary ${\mathcal {A}}$ which has access to an oracle ${\mathcal {O}}$, which is written as ${\mathcal {A}}^{{\mathcal {O}}}$. ${\mathcal {A}}$ adaptively sends queries to ${\mathcal {O}}$ and receives appropriate responses. At the end of the interaction, ${\mathcal {A}}$ outputs a bit. The adversary is allowed to perform computations and also has access to private random bits.

Let $(K{\mathop {\leftarrow }\limits ^{\$}}{\mathcal {K}}:\mathcal{A}^{F_K(\cdot )}\Rightarrow 1)$ denote the event that K is chosen uniformly at random from ${\mathcal {K}}$ and the adversary produces 1 after interacting with the oracle $F_K(\cdot )$. Let $\$(\cdot )$ be a function chosen uniformly at random from the set of all functions from ${\mathcal {D}}$ to ${\mathcal {R}}$. Let $(\mathcal{A}^{\$(\cdot )}\Rightarrow 1)$ denote the event that the adversary produces 1 after interacting with the oracle $\$(\cdot )$.

The advantage of ${\mathcal {A}}$ in breaking the pseudo-randomness of $\{F_K\}_{K\in {\mathcal {K}}}$ is defined as follows.

$$\begin{aligned} {\textsf {Adv}} ^\mathrm{\small prf}_{F}({{{\mathcal {A}}}})= \Pr \left[ K{\mathop {\leftarrow }\limits ^{\$}}{\mathcal {K}}:\mathcal{A}^{F_K(\cdot )}\Rightarrow 1\right] -\Pr \left[ \mathcal{A}^{\$(\cdot )}\Rightarrow 1\right] . \end{aligned}$$

(1)

The probabilities are over the randomness of ${\mathcal {A}}$, the choice of K and the randomness of $\$(\cdot )$.

Suppose that ${\mathcal {A}}$ makes a total of q queries sending a total of $\sigma $ bits in all the queries. By ${\textsf {Adv}} ^\mathrm{\small prf}_{F}(t,q,\sigma )$ we will denote the maximum advantage of any adversary taking time at most t, making at most q queries and sending at most $\sigma $ bits in all its queries.

2.3 Variable tag length nonce-based message authentication code

A MAC scheme has two algorithms, namely, the tag generation algorithm and the verification algorithm. Typically, in a MAC scheme, tags are binary strings of some fixed length. The definition of MAC schemes, however, does not require tags to have the same length. So, it is possible to consider variable length tags within the ambit of the currently used definition of MAC schemes.

Our goal, on the other hand, is different. We would like the tag length to be provided as part of the input to the tag generation and verification algorithms. So, for the same message, by providing different values of the tag length, it is possible to generate tags of different lengths. This feature is not covered by the presently used definition of MAC schemes. We extend the syntax of MAC schemes and the definition of security to incorporate this feature.

A nonce-based MAC scheme is given by the message space ${\mathcal {M}}$, the nonce space ${\mathcal {N}}$, the key space ${\mathcal {K}}$, the allowed set ${\mathcal {L}}$ of tag lengths, the tag space ${\mathcal {T}}$; and two algorithms ${\textsf {nvMAC}} .{\textsf {Gen}} (K,N,x,\lambda )$ and ${\textsf {nvMAC}} .{\textsf {Verify}} (K,N,x,{\textsf {tag}} ,\lambda )$, where $K\in {\mathcal {K}}$, $N\in {\mathcal {N}}$, $x\in {\mathcal {M}}$, $\lambda \in {\mathcal {L}}$ and ${\textsf {tag}} \in {\mathcal {T}}$. We consider ${\mathcal {M}}$, ${\mathcal {N}}$, ${\mathcal {K}}$ and ${\mathcal {L}}$ to be finite non-empty sets and ${\mathcal {T}}$ to be equal to $\cup _{i\in {\mathcal {L}}}\{0,1\}^i$. We write ${\textsf {nvMAC}} .{\textsf {Gen}} _K(N, x,\lambda )$ to denote ${\textsf {nvMAC}} .{\textsf {Gen}} (K,N,x,\lambda )$, and similarly ${\textsf {nvMAC}} .{\textsf {Verify}} _K(N,x,{\textsf {tag}} ,\lambda )$ to denote ${\textsf {nvMAC}} .{\textsf {Verify}} (K,N,x,{\textsf {tag}} ,\lambda )$.

The inputs and outputs of ${\textsf {nvMAC}} .{\textsf {Gen}} _K(N,x,\lambda )$ and ${\textsf {nvMAC}} .{\textsf {Verify}} _K(N,x,{\textsf {tag}} ,\lambda )$ are as follows.

${\textsf {nvMAC}} .{\textsf {Gen}} _K(N,x,\lambda )$:

input: $K\in {\mathcal {K}}$; $N\in {\mathcal {N}}$; $x\in {\mathcal {M}}$; and $\lambda \in {\mathcal {L}}$.

output: ${\textsf {tag}} \in {\mathcal {T}}$ is a binary string of length $\lambda $.
${\textsf {nvMAC}} .{\textsf {Verify}} _K(N,x,{\textsf {tag}} ,\lambda )$:

input: $K\in {\mathcal {K}}$; $N\in {\mathcal {N}}$; $x\in {\mathcal {M}}$; ${\textsf {tag}} \in {\mathcal {T}}$; and $\lambda \in {\mathcal {L}}$ such that ${\textsf {tag}} $ is of length $\lambda $.

output: an element from the set $\{{\textsf {true}} ,{\textsf {false}} \}$. The value ${\textsf {true}} $ indicates that the input is accepted while the value false indicates that the input is rejected.

The following correctness condition must hold.

$$\begin{aligned} {\textsf {nvMAC}} .{\textsf {Verify}} _K(N,x,{\textsf {nvMAC}} .{\textsf {Gen}} _K(N,x,\lambda ),\lambda )= & {} {\textsf {true}} . \end{aligned}$$

Security: The security for a (nonce-based) MAC scheme against an adversary ${\mathcal {A}}$ is modelled as follows. Suppose K is chosen uniformly at random from ${\mathcal {K}}$ and the tag generation and verification algorithms are instantiated with K. ${\mathcal {A}}$ is given oracle access to the tag generation and the verification algorithms. ${\mathcal {A}}$ makes a total of $q_g$ queries to the tag generation oracle and a total of $q_v$ queries to the verification oracle. The queries are made adaptively and queries to the tag generation oracle can be interleaved with those to the verification oracle.

Let the queries to the tag generation oracle be

$$\begin{aligned} \left( N_g^{(1)},x_g^{(1)},\lambda _g^{(1)}\right) ,\ldots ,\left( N_g^{(q_g)},x_g^{(q_g)},\lambda _g^{(q_g)}\right) \end{aligned}$$

and the corresponding responses be ${\textsf {tag}} _g^{(1)},\ldots ,{\textsf {tag}} _g^{(q_g)}$ respectively. Similarly, let the queries to the verification oracle be

$$\begin{aligned} \left( N_v^{(1)},x_v^{(1)},{\textsf {tag}} _v^{(1)},\lambda _v^{(1)}\right) ,\ldots , \left( N_v^{(q_v)},x_v^{(q_v)},{\textsf {tag}} _v^{(q_v)},\lambda _v^{(q_v)}\right) \end{aligned}$$

and the corresponding responses be ${\textsf {xxx}} _v^{(1)},\ldots ,{\textsf {xxx}} _v^{(q_v)}$ respectively, where for $1\le j\le q_v$, ${\textsf {xxx}} _v^{(j)}$ is either true or false. The query profile of ${\mathcal {A}}$ is the list

$$\begin{aligned} {\mathfrak {C}}= & {} (q_g, q_v, ({\mathfrak {n}}_g^{(1)},{\mathfrak {m}}_g^{(1)},\lambda _g^{(1)}),\ldots ,({\mathfrak {n}}_g^{(q_g)},{\mathfrak {m}}_g^{(q_g)},\lambda _g^{(q_g)}), ({\mathfrak {n}}_v^{(1)},{\mathfrak {m}}_v^{(1)},\lambda _v^{(1)}), \nonumber \\&\ldots ,({\mathfrak {n}}_v^{(q_v)},{\mathfrak {m}}_v^{(q_v)},\lambda _v^{(q_v)})) \end{aligned}$$

(2)

where for $1\le s\le q_g$, ${\mathfrak {n}}_g^{(s)}={\textsf {len}} (N_g^{(s)}), {\mathfrak {m}}_g^{(s)}={\textsf {len}} (x_g^{(s)})$ and for $1\le s\le q_v$, ${\mathfrak {n}}_v^{(s)}={\textsf {len}} (N_v^{(s)}), {\mathfrak {m}}_v^{(s)}={\textsf {len}} (x_v^{(s)})$.

There are two restrictions on the adversary. The first is a weaker form of nonce-respecting behaviour, namely, $\left( N_g^{(i)},\lambda _g^{(i)}\right) \ne \left( N_g^{(j)},\lambda _g^{(j)}\right) $ for $1\le i<j\le q_g$. Note that the adversary is allowed to repeat (nonce, tag-length) pair for verification queries and it is also allowed to re-use a (nonce, tag-length) pair used in a tag generation query in one or more verification queries. Usual nonce-respecting behaviour requires the nonces in the tag generation queries to be distinct. By relaxing this condition, we provide the adversary with more power. So, a scheme proved secure against the weaker form of nonce-respecting behaviour maintains security even if nonces are repeated in tag generation queries as long as the (nonce, tag-length) pairs are distinct. The second restriction on the adversary is that it should not make any useless query. A query is useless if its response can be computed by the adversary. This means that the adversary should not repeat a query to the tag generation oracle or the verification oracle; and it should not query the verification oracle with $\left( N_g^{(i)},x_g^{(i)},{\textsf {tag}} _g^{(i)},\lambda _g^{(i)}\right) $ for any i in $\{1,\ldots ,q_g\}$.

The adversary makes a number of verification queries. The tag lengths of these queries could be different. There is no restriction on the adversary to choose a target tag length before making the queries to its oracles. For any tag length $\lambda $, the adversary is successful if a verification query for this tag length returns ${\textsf {true}} $. So, for any value of the tag length, there is a corresponding event that the adversary is successful for a particular tag length. Formally, for $\lambda \in {{{\mathcal {L}}}}$, let ${\textsf {succ}} _{{\mathcal {A}}}(\lambda )$ be the event that there is some $j\in \{1,\ldots ,q_v\}$ such that $\lambda _v^{(j)}=\lambda $ and ${\textsf {nvMAC}} .{\textsf {Verify}} _K\left( N_v^{(j)},x_v^{(j)},{\textsf {tag}} _v^{(j)},\lambda _v^{(j)}\right) $ returns ${\textsf {true}} $. For each $\lambda \in {\mathcal {L}}$, the adversary’s advantage in breaking the authenticity of ${\textsf {nvMAC}} $ is defined to be $\Pr [{\textsf {succ}} _{{\mathcal {A}}}(\lambda )]$. This is written as follows.

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ]({\mathcal {A}})= & {} \Pr \left[ {\textsf {succ}} _{{\mathcal {A}}}(\lambda )\right] . \end{aligned}$$

(3)

The above probability is taken over the uniform random choice of K from ${\mathcal {K}}$ and over the possible internal randomness of the adversary ${\mathcal {A}}$.

Given a query profile ${\mathfrak {C}}$, ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ](t,{\mathfrak {C}})$ is the maximum of ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ]({\mathcal {A}})$ taken over all adversaries running in time t and having query profile ${\mathfrak {C}}$.

Remark

The security model allows nonces to be repeated with different tag lengths. As explained above, this provides the adversary with more power. We further note that allowing nonces to be reused with different tag lengths permits generation of fewer nonces which may be of interest in some resource-constrained applications. At this point though, we are unable to provide a concrete example.

Security in terms of query complexity: The query complexity is the total number of bits sent by the adversary in all its queries. For tag generation queries, this consists of the number of bits sent as part of the nonces, the messages and the $\lambda _g$’s; for verification queries, this consists of the number of bits sent as part of the nonces, the messages, the tags and the $\lambda _v$’s. Let the $q_g$ tag generation queries require a total of $\sigma _g$ bits and the $q_v$ verification queries require a total of $\sigma _v$ bits. So, $\sigma _g = \sum _{1\le i \le q_g}({\textsf {len}} (N_g^{(i)})+{\textsf {len}} (x_g^{(i)})+{\textsf {len}} (\lambda _g^{(i)}))= \sum _{1\le i \le q_g}({\mathfrak {n}}_g^{(i)}+{\mathfrak {m}}_g^{(i)}+{\textsf {len}} (\lambda _g^{(i)}))$ and $\sigma _v = \sum _{1\le i \le q_v}({\textsf {len}} (N_v^{(i)})+{\textsf {len}} (x_v^{(i)})+{\textsf {len}} ({\textsf {tag}} _v^{(i)})+{\textsf {len}} (\lambda _v^{(i)}))= \sum _{1\le i \le q_v}({\mathfrak {n}}_v^{(i)}+{\mathfrak {m}}_v^{(i)}+\lambda _v^{(i)}+{\textsf {len}} (\lambda _v^{(i)}))$, as ${\textsf {len}} ({\textsf {tag}} _v^{(i)}) = \lambda _v^{(i)}$. If the elements of ${\mathcal {L}}$ are expressed as ${\mathfrak {t}}$-bit binary strings, then $\sigma _g = \sum _{1\le i \le q_g}({\mathfrak {n}}_g^{(i)}+{\mathfrak {m}}_g^{(i)})+q_g{\mathfrak {t}}$ and $\sigma _v = \sum _{1\le i \le q_v}({\mathfrak {n}}_v^{(i)}+{\mathfrak {m}}_v^{(i)}+\lambda _v^{(i)})+q_v{\mathfrak {t}}$. Given query complexity $(\sigma _g, \sigma _v)$, ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ](t,\sigma _g, \sigma _v)$ is the maximum of ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ]({\mathcal {A}})$ taken over all adversaries ${\mathcal {A}}$ running in time t and having query complexity $(\sigma _g, \sigma _v)$.

Given a query profile ${\mathfrak {C}}$ of any adversary ${\mathcal {A}}$ the corresponding query complexity $(\sigma _g, \sigma _v)$ can be readily derived in the above manner. On the other hand, it is to be noted that, various query profiles can have the same query complexity. Hence, in the security definition above in terms of query complexity, when one maximises over query complexity, the value obtained is the maximum over all possible query profiles which have that same query complexity. This gives us the following proposition.

Proposition 1

Let us fix a query complexity $(\sigma _g, \sigma _v)$ and let be the set of all query profiles having query complexity $(\sigma _g, \sigma _v)$, i.e.,

Then,

(4)

Later we explain the rationale for considering query profiles.

Information theoretic security: This consists of analysing the security of a MAC scheme against a computationally unbounded adversary. In other words, the probability in (3) is considered for an adversary ${\mathcal {A}}$ without any reference to the run time of ${\mathcal {A}}$. For such a computationally unbounded adversary ${\mathcal {A}}$, without loss of generality, we may assume ${\mathcal {A}}$ to be deterministic. In the context of information theoretic security, given a query profile ${\mathfrak {C}}$, ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ]({\mathfrak {C}})$ is the maximum of ${\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda ]({\mathcal {A}})$ taken over all adversaries ${\mathcal {A}}$ having query profile ${\mathfrak {C}}$.

3 Towards building a variable tag length MAC

It may appear that a variable tag length nonce-based MAC scheme can be obtained simply by truncating the output of the Wegman-Carter MAC algorithm. This, however, does not work as we show in this section. We further consider several “natural” extensions of the Wegman-Carter MAC algorithm and show that most of them are insecure. Only two of these extensions are secure: one of them is a generic construction, while we prove the security of the other in the next section. Overall, the discussion in the present section may be considered as showing the subtlety involved in constructing a variable tag length nonce-based MAC scheme.

Let ${\mathcal {N}}$ be the nonce space and ${\mathcal {M}}$ be the message space. Let $\{{\textsf {F}} _K\}_{K\in {\mathcal {K}}}$ be a PRF such that ${\textsf {F}} _K:{\mathcal {N}}\rightarrow \{0,1\}^n$; let $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ be an AXU hash function such that ${\textsf {Hash}} _{\tau }:{\mathcal {M}}\rightarrow \{0,1\}^n$. Given $\{{\textsf {F}} _K\}_{K\in {\mathcal {K}}}$ and $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$, the Wegman-Carter MAC [29] is the following. A nonce-message pair (N, x) is mapped under a key $(K,\tau )$ to ${\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x)$, i.e.,

$$\begin{aligned} {\textsf {WC}} \text{- }{\textsf {nvMAC}} : (N,x) {\mathop {\longrightarrow }\limits ^{(K,\tau )}} {\textsf {F}} _K(N) \oplus {\textsf {Hash}} _{\tau }(x). \end{aligned}$$

(5)

Below we argue that several natural extensions of ${\textsf {WC}} \text{- }{\textsf {nvMAC}} $ are not secure. We assume that binary representation of tag lengths fit within a byte. The attacks are shown for the following specific choice of the hash function. Under a fixed representation of the elements of the finite field ${\mathbb {F}}_{2^n}$, we identify the elements of ${\mathbb {F}}_{2^n}$ with the set $\{0,1\}^n$. The specific hash function that we consider is ${\textsf {Hash}} _{\tau }(x)=\tau x$, i.e., the output of ${\textsf {Hash}} _{\tau }(x)$ is the n-bit string representing the product of $\tau $ and x considered as elements of ${\mathbb {F}}_{2^n}$. This hash function is known to be AXU. Attacks on schemes built using this specific hash function is sufficient to show that the schemes described below are not secure for an arbitrary AXU hash function. The choice of the hash function fixes the key space of the hash function to be $\Theta = {\mathbb {F}}_{2^n}$ and the message space ${\mathcal {M}}$ to be either ${\mathbb {F}}_{2^n}$ or ${\mathbb {F}}_{2^{n-8}}$, depending on the scheme.

We will use the following simple fact about the specific hash function that we consider.

Proposition 2

Consider the AXU hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in {\mathbb {F}}_{2^n}}$ where ${\textsf {Hash}} _{\tau }(x)$ $ = \tau x$. Let $x_1$ and $x_2$ be distinct elements of ${\mathbb {F}}_{2^n}$ and c be such that ${\textsf {Hash}} _{\tau }(x_1)\oplus {\textsf {Hash}} _{\tau }(x_2) = c$, then $\tau = c(x_1\oplus x_2)^{-1}$.

The most obvious approach to obtain a variable tag length scheme from (5) is to truncate the output, i.e.,

$$\begin{aligned} {\textsf {trunc}} :\ (N,x,\lambda ) {\mathop {\longrightarrow }\limits ^{(K,\tau )}} {\textsf {msb}} _{\lambda }({\textsf {WC}} \text{- }{\textsf {nvMAC}} _{K,\tau }(N,x)) = {\textsf {msb}} _\lambda ({\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x)). \end{aligned}$$

The scheme trunc is not secure as can be seen from the following attacks. Note that in this case the message space is ${\mathbb {F}}_{2^n}$.

Attack 1 on trunc: Let x be a message and N be a nonce. The adversary makes a tag generation query (N, x, n) and gets in response t. Now the adversary makes a verification query $(N,x,{\textsf {msb}} _{n-1}(t),n-1)$ and it is successful with probability 1. Thus the adversary makes a successful forgery with only one tag generation query.
Attack 2 on trunc: Another attack which repeats nonces in tag generation queries and reveals more information is the following. Let $x_1,x_2$ and $x_3$ be distinct messages and N be a nonce. The adversary makes two tag generation queries $(N,x_1,n)$ and $(N,x_2,n-1)$ and gets in response $t_1$ and $t_2$ respectively. So, we have the following relations: ${\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x_1) = t_1$ and ${\textsf {msb}} _{n-1}({{\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x_2)}) = t_2.$ From the second relation, it follows that either ${{\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x_2)} = t_2||0$ or ${{\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x_2)} = t_2||1$. Using Proposition 2, the adversary solves the equations ${\textsf {Hash}} _{\tau }(x_1) \oplus {\textsf {Hash}} _{\tau }(x_2) = t_1 \oplus (t_2||0)$ and ${\textsf {Hash}} _{\tau }(x_1) \oplus {\textsf {Hash}} _{\tau }(x_2) = t_1 \oplus (t_2||1)$ for $\tau $ to obtain the solutions $\tau _0$ and $\tau _1$ respectively. As ${{\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x_2)}$ takes exactly one of the two values $t_2||0$ or $t_2||1$, $\tau $ takes exactly one of the two values $\tau _0$ or $\tau _1$. Let $y_0 = t_1 \oplus {\textsf {Hash}} _{\tau _0}(x_1)$. The adversary makes a verification query $(N,x_3,y_0 \oplus {\textsf {Hash}} _{\tau _0}(x_3),n)$. If the verification query is successful then $\tau _0$ is the correct value of $\tau $. If the verification query fails, then $\tau _1$ is the correct value of $\tau $. Thus the adversary recovers the hash key with two tag generation and one verification queries.

The first attack shows that a simple truncation of the Wegman-Carter MAC scheme does not work while the second attack shows that by repeating nonces in tag generation queries the hash key can be obtained. One possibility of modifying ${\textsf {trunc}} $ is to apply ${\textsf {F}} _K$ a second time before applying truncation, i.e., the tag is obtained as ${\textsf {msb}} _\lambda ({\textsf {F}} _K({\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }(x))).$ The resulting scheme is also not secure. The first simple attack on ${\textsf {trunc}} $ also works for this modified scheme.

In the scheme ${\textsf {trunc}} $, the output of neither ${\textsf {F}} $ nor ${\textsf {Hash}} $ depends on $\lambda $. To rectify this situation, one may introduce $\lambda $ as part of the input of one or both of ${\textsf {F}} $ and ${\textsf {Hash}} $. Another possibility is to have one or both of the keys K and $\tau $ to depend on $\lambda $. Key dependencies are achieved by using a family of independent keys $\{K_{\lambda }\}_{\lambda \in {\mathcal {L}}}$ and/or a family of independent keys $\{\tau _{\lambda }\}_{\lambda \in {\mathcal {L}}}$. The various schemes that arise from such considerations are as follows.

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 1_{K,\tau }:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K,\tau )}}&{\textsf {msb}} _\lambda ({\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N)\oplus {\textsf {Hash}} _{\tau }(x)). \end{aligned}$$

(6)

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 2_{K,\tau }:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K,\tau )}}&{\textsf {msb}} _\lambda ({\textsf {F}} _K(N)\oplus {\textsf {Hash}} _{\tau }({\textsf {bin}} _8(\lambda )||x)). \end{aligned}$$

(7)

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 3_{K,\tau }:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K,\tau )}}&{\textsf {msb}} _\lambda ({\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N)\oplus {\textsf {Hash}} _{\tau }({\textsf {bin}} _8(\lambda )||x)).\nonumber \\ \end{aligned}$$

(8)

$$\begin{aligned} {\textsf {nvMAC\text{- }Generic}} _{(K_{\lambda },\tau _{\lambda })_{\lambda \in {\mathcal {L}}}}:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K_{\lambda },\tau _{\lambda })}}&{\textsf {msb}} _\lambda ({\textsf {F}} _{K_{\lambda }}(N)\oplus {\textsf {Hash}} _{\tau _{\lambda }}(x)). \end{aligned}$$

(9)

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 4_{(K_{\lambda },\tau )_{\lambda \in {\mathcal {L}}}}:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K_{\lambda },\tau )}}&{\textsf {msb}} _\lambda ({\textsf {F}} _{K_{\lambda }}(N)\oplus {\textsf {Hash}} _{\tau }(x)). \end{aligned}$$

(10)

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 5_{(K_{\lambda },\tau )_{\lambda \in {\mathcal {L}}}}:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K_{\lambda },\tau )}}&{\textsf {msb}} _\lambda ({\textsf {F}} _{K_{\lambda }}(N)\oplus {\textsf {Hash}} _{\tau }({\textsf {bin}} _8(\lambda )||x)). \end{aligned}$$

(11)

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 6_{(K,\tau _{\lambda })_{\lambda \in {\mathcal {L}}}}:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K,\tau _{\lambda })}}&{\textsf {msb}} _\lambda ({\textsf {F}} _{K}(N)\oplus {\textsf {Hash}} _{\tau _{\lambda }}(x)). \end{aligned}$$

(12)

$$\begin{aligned} {\textsf {nvMAC}} _{(K,\tau _{\lambda })_{\lambda \in {\mathcal {L}}}}:\ (N,x,\lambda )&{\mathop {\longrightarrow }\limits ^{(K,\tau _{\lambda })}}&{\textsf {msb}} _\lambda ({\textsf {F}} _{K}({\textsf {bin}} _8(\lambda )||N)\oplus {\textsf {Hash}} _{\tau _{\lambda }}(x)). \end{aligned}$$

(13)

Dependencies of input and/or key on $\lambda $ for the above schemes are summarised in Table 1.

Table 1 For the schemes in (6) to (13), a summary of whether the input and/or the key of ${\textsf {F}} $ and/or ${\textsf {Hash}} $ depend on the tag length $\lambda $

Full size table

Nonce stealing: Finney [12] had suggested that the nonce may be reduced by a few bits and a binary encoding of the tag length be inserted. In the present context, this refers to letting the input of ${\textsf {F}} $ depend on the tag length. From Table 1, we see that the schemes ${\textsf {nvMAC}} $-${\textsf {t}} 1$, ${\textsf {nvMAC}} $-${\textsf {t}} 3$ and ${\textsf {nvMAC}} $ use nonce stealing. While ${\textsf {nvMAC}} $ is secure (as proved later), schemes ${\textsf {nvMAC}} $-${\textsf {t}} 1$ and ${\textsf {nvMAC}} $-${\textsf {t}} 3$ are insecure. So, nonce stealing by itself does not guarantee security.

For the ensuing discussion, we will consider the message space for the schemes ${\textsf {nvMAC}} \text{- }{\textsf {t}} 1$, ${\textsf {nvMAC}} $-${\textsf {Generic}} $, ${\textsf {nvMAC}} \text{- }{\textsf {t}} 4$ and ${\textsf {nvMAC}} \text{- }{\textsf {t}} 6$ to be ${\mathbb {F}}_{2^n}$, and that for the schemes ${\textsf {nvMAC}} \text{- }{\textsf {t}} 2$, ${\textsf {nvMAC}} \text{- }{\textsf {t}} 3$ and ${\textsf {nvMAC}} \text{- }{\textsf {t}} 5$ to be ${\mathbb {F}}_{2^{n-8}}$.

Algorithm 1 describes an attack on ${\textsf {nvMAC\text{- }t}} 1$ which uses ${\textsf {findTag}} $ as a subroutine. In the attack, the tag generation and verification oracles are denoted by ${\mathcal {O}}_g$ and ${\mathcal {O}}_v$ respectively. On being supplied with input $(N,x,\lambda )$, the function ${\textsf {findTag}} (N,x,\lambda )$ finds ${\textsf {tag}} $ such that $(N,x,{\textsf {tag}} ,\lambda )$ passes the test by the verification oracle. To do this, ${\textsf {findTag}} $ repeatedly queries the verification oracle, until a suitable ${\textsf {tag}} $ is obtained. The expected number of queries made by ${\textsf {findTag}} (N,x,\lambda )$ is $2^{\lambda }$. Algorithm 1 invokes ${\textsf {findTag}} $ with values of the tag length which are less than the target tag length.

The intuition behind the attack in Algorithm 1 is the following. The key $(K,\tau )$ of the scheme does not depend on $\lambda $. In particular, as the hash key $\tau $ does not depend on $\lambda $, the attack retrieves $\tau $ using a smaller value of $\lambda $ and uses it for the forgery with the target $\lambda $ successfully. Retrieving $\tau $ using a smaller value of $\lambda $ requires significantly less number of oracle queries than that required for an attack by exhaustive search for the target $\lambda $. The analysis of the attack is given in Proposition 3. This divide-and-conquer attack strategy of using shorter tag length to learn information, with low cost, which is useful for longer tag lengths has previously been used in the context of AE [11, 22].

Proposition 3

The attack given in Algorithm 1 on the scheme ${\textsf {nvMAC\text{- }t}} 1$ given in (6) produces a forgery for tag length $\lambda $ which is correct with probability 1. It requires one tag generation query and at most $2^{\lambda _1} + 2^{n-\lambda _1}$ verification queries on tag length $\lambda _1$ and one tag generation query and one verification query on tag length $\lambda $.

Proof

That the attack mentioned in Algorithm 1 forges with probability 1 is proved if it can be shown that the forgery returned by the attack in Step 17 is accepted, i.e. the corresponding response from ${\mathcal {O}}_v$ is true.

From Step 4 we get,

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {F}} _K({\textsf {bin}} _8(\lambda _1)||N_1)\oplus {\textsf {Hash}} _{\tau }(x_1)) = {\textsf {tag}} ^{(1)}. \end{aligned}$$

(14)

The ${\textsf {tag}} ^{(2)}$ returned by Step 5 satisfies

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {F}} _K({\textsf {bin}} _8(\lambda _1)||N_1)\oplus {\textsf {Hash}} _{\tau }(x_2)) = {\textsf {tag}} ^{(2)}. \end{aligned}$$

(15)

So, from (14) and (15) we get,

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau }(x_1)\oplus {\textsf {Hash}} _{\tau }(x_2)) = {\textsf {tag}} ^{(1)} \oplus {\textsf {tag}} ^{(2)}. \end{aligned}$$

(16)

Here ${\textsf {tag}} ^{(1)} \oplus {\textsf {tag}} ^{(2)}$ is a $\lambda _1$-bit binary string. Following Proposition 2, for each choice of c in the do-while loop in Steps 7 to 14, the equation in Step 10 can be solved to get $\tau _c$ and $x_c$.

The fact that ${\textsf {Hash}} _{\tau }(x_1)\oplus {\textsf {Hash}} _{\tau }(x_2) \in \{0,1\}^n$ and (16) suggest that there is a correct c, such that the equation in Step 10 holds and we consider that iteration of the do-while loop which deals with this particular c. The $\tau _c$ obtained in this iteration is the actual hash key used in the scheme. So,

$$\begin{aligned}&{\textsf {nvMAC}} \text{- }{\textsf {t}} 1(N_1,x_3,\lambda _1) \nonumber \\&\quad = {\textsf {msb}} _{\lambda _1}({\textsf {F}} _K({\textsf {bin}} _8(\lambda _1)||N_1)\oplus {\textsf {Hash}} _{\tau _c}(x_3)) \nonumber \\&\quad = {\textsf {tag}} ^{(1)} \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _c}(x_1)) \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _c}(x_3)) \end{aligned}$$

(17)

$$\begin{aligned}&\quad = x_c \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _c}(x_3)) . \end{aligned}$$

(18)

The expression in (17) comes from (14) and that in (18) comes from Step 12 in Algorithm 1. Hence, in this particular iteration of the do-while loop, ${\mathcal {R}}_v^{(3)} = {\textsf {true}} $ and the loop terminates.

Since $\lambda =n$, from Step 15 we obtain ${\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N_2)={\textsf {Hash}} _{\tau _c}(x_4) \oplus {\textsf {tag}} ^{(4)}.$ For the choice of x in Step 16, i.e., $x\in {\mathcal {M}}\setminus \{x_4\}$ we have

$$\begin{aligned} {\textsf {nvMAC}} \text{- }{\textsf {t}} 1(N_2,x,\lambda )= & {} {\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N_2)\oplus {\textsf {Hash}} _{\tau _c}(x) \nonumber \\= & {} {\textsf {Hash}} _{\tau _c}(x_4) \oplus {\textsf {tag}} ^{(4)} \oplus {\textsf {Hash}} _{\tau _c}(x), \end{aligned}$$

(19)

which is returned as the tag for $(N_2,x,\lambda )$ in the forgery and hence, the corresponding response from ${\mathcal {O}}_v$ is true with probability 1, which proves the first part of the result.

In the attack, there are 2 tag generation queries in Steps 4 and 15. The subroutine ${\textsf {findTag}} $ makes a maximum of $2^{\lambda _1}$ verification queries on tags of lengths $\lambda _1$. The do-while loop in Steps 7 to 14 iterates at most $2^{n - \lambda _1}$ times for different values of c making a maximum of $2^{n - \lambda _1}$ verification queries on tags of lengths $\lambda _1$. The forgery returned in Step 17 is a verification query on a tag of length $\lambda $. Hence, the attack requires 2 tag generation queries and at most $2^{\lambda _1}+ 2^{n-\lambda _1} + 1$ verification queries including the forgery. $\square $

Remarks

1.
One may note that this work considers variable length tags. So, the adversary can make verification queries for a particular tag length and provide a forgery for another tag length. The attack given in Algorithm 1 on the scheme ${\textsf {nvMAC\text{- }t}} 1$, forges the scheme with an n-bit tag, i.e. the attack is for tag length n; whereas, as shown in Proposition 3, the attack requires 2 tag generation queries and $2^{\lambda _1}+2^{n - \lambda _1}+1$ verification queries including the forgery, where $\lambda _1 < \lambda $. Among these queries, 1 tag generation query and $2^{\lambda _1}+2^{n - \lambda _1}$ verification queries are with tag length $\lambda _1$. For example, suppose $n=128$, and let $\lambda _1=64$. So, the attack uses $2^{65}+1 < 2^{128}$ verification queries and produces a forgery for tag length 128. This constitutes a valid attack for tag length 128.
2.
The security model for variable length tag nonce-based MAC allows nonces in tag generation queries to be repeated as long as the tag lengths are distinct. The attack in Algorithm 1 does not repeat nonces in tag generation queries. So, the scheme ${\textsf {nvMAC\text{- }t}} 1$ is insecure even under the restriction that nonces in tag generation queries are distinct.

Insecurities of the schemes ${\textsf {nvMAC\text{- }t}} 1$ to ${\textsf {nvMAC\text{- }t}} 5$ follow from applications of Algorithm 1.

Attack on ${\textsf {nvMAC\text{- }t}} 2$: Algorithm 1 works with the only modification that the forgery is changed to $(N_2 , x, {\textsf {Hash}} _{\tau _c} ({\textsf {bin}} _8 (\lambda )||x_4 ) \oplus {\textsf {tag}} ^{(4)} \oplus {\textsf {Hash}} _{\tau _c} ({\textsf {bin}} _8 (\lambda )||x), \lambda )$.
Attack on ${\textsf {nvMAC\text{- }t}} 3$: Algorithm 1 works with the only modification that the forgery is changed to $(N_2, x, {\textsf {Hash}} _{\tau _c}({\textsf {bin}} _8(\lambda )||x) \oplus {\textsf {Hash}} _{\tau _c}({\textsf {bin}} _8(\lambda )||x_4) \oplus {\textsf {tag}} ^{(4)}, \lambda )$.
Attack on ${\textsf {nvMAC\text{- }t}} 4$: Algorithm 1 works with the only modification that the forgery is changed to $(N_2, x, {\textsf {Hash}} _{\tau _c}(x) \oplus {\textsf {Hash}} _{\tau _c}(x_4) \oplus {\textsf {tag}} ^{(4)}, \lambda )$.
Attack on ${\textsf {nvMAC\text{- }t}} 5$: Algorithm 1 works with the only modification that the forgery is changed to $(N_2, x, {\textsf {Hash}} _{\tau _c}({\textsf {bin}} _8(\lambda )||x) \oplus {\textsf {Hash}} _{\tau _c}({\textsf {bin}} _8(\lambda )||x_4) \oplus {\textsf {tag}} ^{(4)}, \lambda )$.

The insecurity of ${\textsf {nvMAC\text{- }t}} 6$ is discussed in Appendix A.

The scheme ${\textsf {nvMAC\text{- }Generic}} $ can be considered to be a collection of $\#{\mathcal {L}}$ independent ${\textsf {WC\text{- }nvMAC}} $ schemes, one for each value of $\lambda $. Each of the individual schemes for fixed values of $\lambda $ are already known to be secure, since the proof from [5] applies to the individual schemes where the values of $\lambda $ are fixed. Since the keys of the various schemes are independent, it can be argued that the collection is also secure. The problem, however, is that size of the key increases by a factor of $\#{\mathcal {L}}$. So, ${\textsf {nvMAC\text{- }Generic}} $ cannot be considered to be a practical solution to the problem of obtaining a variable tag length MAC scheme.

The first step towards reducing key size is taken in the scheme ${\textsf {nvMAC}} $ which uses a single key K for ${\textsf {F}} $ and independent keys $\tau _{\lambda }$. In the next section, we prove ${\textsf {nvMAC}} $ to be secure and also consider further variants with smaller keys.

Remark

Suppose ${\textsf {nvMAC\text{- }t}} 1$ is modified to obtain a scheme ${\textsf {nvMAC\text{- }t}} 1^{\prime }$ in the following manner. The ${\textsf {tag}} $ is obtained as ${\textsf {msb}} _\lambda ({\textsf {F}} _K({\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N)\oplus {\textsf {Hash}} _{\tau }(x)))$, i.e., a second application of ${\textsf {F}} _K$ is made before truncating. It is not difficult to show that the scheme mapping $(N,x,\lambda )$, under the key $(K,\tau )$, to the quantity ${\textsf {F}} _K({\textsf {F}} _K({\textsf {bin}} _8(\lambda )||N)\oplus {\textsf {Hash}} _{\tau }(x))$ is a PRF. It can be argued that ${\textsf {nvMAC\text{- }t}} 1^{\prime }$ is a secure variable tag length MAC scheme. However, the security bound for ${\textsf {nvMAC\text{- }t}} 1^{\prime }$ will be in the order of $q^2\varepsilon $, where the total number of queries is q and the hash function is $\varepsilon $-AXU. This bound is higher than the bounds obtained for the schemes that we consider. Hence, we do not consider ${\textsf {nvMAC\text{- }t}} 1^{\prime }$. In the above discussion, we have considered modification of ${\textsf {nvMAC\text{- }t}} 1$ to ${\textsf {nvMAC\text{- }t}} 1^{\prime }$. The same comments apply to similar modifications of the other insecure schemes, namely ${\textsf {nvMAC\text{- }t}} 2$ to ${\textsf {nvMAC\text{- }t}} 6$.

4 Secure and efficient MAC schemes with variable length tag

We start with the scheme ${\textsf {nvMAC}} $ given in (13). We carry out an information theoretic analysis of this scheme. To this end, we consider the scheme obtained by replacing ${\textsf {F}} _K$ with a random function $f:\{0,1\}^n\rightarrow \{0,1\}^n$. The tag generation algorithm for this scheme is shown in Table 2. We require a hash family $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$, where for each $\tau \in \Theta $, ${\textsf {Hash}} _{\tau }:{\mathcal {M}}\rightarrow \{0,1\}^n$, with ${\mathcal {M}}=\cup _{i=0}^L\{0,1\}^i$ for some sufficiently large positive integer L.

The nonce space for the scheme ${\textsf {nvMAC}} $ is ${\mathcal {N}}=\{0,1\}^{n-8}$ and the message space is ${\mathcal {M}}$. Let ${\mathcal {L}}\subseteq \{1,\ldots ,\min (256,n)\}$ be the allowed set of tag lengths. Note that tag length equal to zero is not allowed and there are 256 possible values of the tag length that are supported. If $\lambda $ is the tag length, then $\lambda -1$ is at most 255 and consequently fits within a byte. So, instead of encoding $\lambda $, we encode $\lambda -1$. This is a modification that we make to the scheme given in (13). Note that larger (or smaller) values of $\#{\mathcal {L}}$ can be considered by suitably adjusting the length of the nonces. From a practical point of view, however, it is difficult to think of any application which would require $\#{\mathcal {L}}$ to be more than 256.

The key space for ${\textsf {nvMAC}} $ is $\Theta ^{\#{\mathcal {L}}}$, i.e., a particular key is a tuple $(\tau _{\lambda })_{\lambda \in {\mathcal {L}}}$. The key generation algorithm consists of choosing $\tau _{\lambda }$ independently and uniformly at random from $\Theta $ for each $\lambda $. The verification algorithm is as follows. Given $(N,x,{\textsf {tag}} , \lambda )$, compute ${\textsf {tag}} ^{\prime }={\textsf {nvMAC}} .{\textsf {Gen}} _{(\tau _{\lambda })_{\lambda \in {\mathcal {L}}}}(N,x,\lambda )$; if ${\textsf {tag}} ={\textsf {tag}} ^{\prime }$ then return ${\textsf {true}} $, else return ${\textsf {false}} $.

Here f is a random function but, not necessarily a uniform random function. Given q pairs $(a_1,b_1),\ldots ,(a_q,b_q)$, the q-interpolation probability [5] of f is defined to be $\Pr [f(a_1)=b_1,\ldots ,f(a_q)=b_q]$. Following the analysis in [5], the security bound for the resulting scheme is obtained in terms of the interpolation probability of f. Known bounds on the interpolation probability of uniform random function and uniform random permutation provide the corresponding bounds on the security of the resulting nvMAC schemes.

Table 2 A secure and efficient nvMAC scheme from a random function

Full size table

Theorem 1

In the scheme ${\textsf {nvMAC}} $ defined in Table 2, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ is $\varepsilon $-AXU, where $\varepsilon (\ell ,\ell ^{\prime }) \ge 1/2^n$ for all $\ell ,\ell ^{\prime }\le L$.

Fix a query profile ${\mathfrak {C}}$. For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $ which are in ${\mathfrak {C}}$. Let $\lambda $ be such that $q_{v,\lambda }\ge 1$ and for $1\le i\le q_{v,\lambda }$, let $Q_{v,\lambda }^{(i)}=(N_{v,\lambda }^{(i)},x_{v,\lambda }^{(i)},{\textsf {tag}} _{v,\lambda }^{(i)},\lambda )$ be the i-th verification query with tag length $\lambda $. Let $\ell _{v,\lambda }^{(i)}={\textsf {len}} (x_{v,\lambda }^{(i)})$. Corresponding to $Q_{v,\lambda }^{(i)}$, there is at most one tag generation query $Q_{g,\lambda }^{(i^{\star })}=(N_{g,\lambda }^{(i^{\star })},x_{g,\lambda }^{(i^{\star })},\lambda )$ such that $N_{v,\lambda }^{(i)}=N_{g,\lambda }^{(i^{\star })}$. Let $\ell _{g,\lambda }^{(i^{\star })}={\textsf {len}} (x_{g,\lambda }^{(i^{\star })})$ if there is such a $Q_{g,\lambda }^{(i^{\star })}$, otherwise $\ell _{g,\lambda }^{(i^{\star })}$ is undefined.

Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Further, suppose $\delta _i$ be such that the i-interpolation probability of f is at most $\delta _i/(2^n)^{i}$. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t,{\mathfrak {C}})\le & {} \frac{1}{2^{\lambda _0}} \times \sum _{1\le i \le q_{v,\lambda _0}}\gamma _i \end{aligned}$$

(20)

where $\gamma _i=2^n\delta _q\varepsilon \left( \ell _{v,\lambda _0}^{(i)},\ell _{g,\lambda _0}^{(i^{\star })}\right) $ if there is a $Q_{g,\lambda _0}^{(i^{\star })}$ corresponding to $Q_{v,\lambda _0}^{(i)}$ with $N_{v,\lambda _0}^{(i)}=N_{g,\lambda _0}^{(i^{\star })}$; otherwise $\gamma _i=\delta _{q+1}$.

Remark

It has been proved in [5], that for $1\le j\le 2^n$, if f is a uniform random function, then $\delta _j=1$, and if f is a uniform random permutation, then $\delta _j \le (1-(j-1)/2^n)^{-j/2}$.

Proof

The proof builds upon and generalises ideas used in the security proof of the Wegman-Carter nonce-based MAC scheme given in [5].

Let ${\mathcal {A}}$ be an adversary attacking the authenticity of ${\textsf {nvMAC}} $. The result concerns information theoretic security and so we consider the adversary to be deterministic. ${\mathcal {A}}$ makes a number of queries to its oracles and receives the appropriate responses. The interaction of ${\mathcal {A}}$ with its two oracles is given by a transcript ${\mathcal {T}}$ which is a list of the queries made by ${\mathcal {A}}$ and the responses it received in return. The adversary’s view of the oracles is completely determined by the transcript ${\mathcal {T}}$. By ${\mathcal {A}}({\mathcal {T}})$, we will denote the interaction of ${\mathcal {A}}$ with the oracles as given by the transcript ${\mathcal {T}}$. The responses to the queries made by ${\mathcal {A}}$ are computed using the random function f and hence are random variables. Since ${\mathcal {A}}$ is deterministic, the randomness in a transcript ${\mathcal {T}}$ arises only from these responses. By ${\textsf {succ}} ({\mathcal {A}}({\mathcal {T}}),\lambda _0)$ we will denote the event that the adversary ${\mathcal {A}}$ with transcript ${\mathcal {T}}$ makes a verification query for tag length $\lambda _0$ which returns ${\textsf {true}} $. So, if the transcript ${\mathcal {T}}$ corresponds to the query profile ${\mathfrak {C}}$, then ${\textsf {Adv}} _{\textsf {nvMAC}} ^\mathrm{\small auth}[\lambda _0](t, {\mathfrak {C}})=\Pr [{\textsf {succ}} ({\mathcal {A}}({\mathcal {T}}),\lambda _0)]$.

The first reduction is to assume that $q_{v,\lambda _0} = 1$. If $q_{v,\lambda _0} = 0$, i.e., ${\mathcal {A}}$ does not make any verification query, then clearly, ${\mathcal {A}}$ has advantage 0 so that the theorem is trivially proved. So, suppose that ${\mathcal {A}}$ with transcript ${\mathcal {T}}$ makes $q_{v,\lambda _0}>1$ verification queries for tag-length $\lambda _0$. Let ${\mathcal {E}}$ be the event that the first verification query for the tag length $\lambda _0$ is successful and ${\mathcal {S}}$ be the event that one of the later verification queries for the tag length $\lambda _0$ is successful. So,

$$\begin{aligned} {\textsf {Adv}} _{\textsf {nvMAC}} ^\mathrm{\small auth}[\lambda _0](\mathcal {A}) = \Pr [{\textsf {succ}} (\mathcal {A}(\mathcal {T}),\lambda _0)] = \Pr [\mathcal {E} \vee \mathcal {S}]= & {} \Pr [\mathcal {E} \vee (\overline{\mathcal {E}} \wedge \mathcal {S})] \\= & {} \Pr [\mathcal {E}] + \Pr [\overline{\mathcal {E}} \wedge \mathcal {S}]. \end{aligned}$$

Given the adversary ${\mathcal {A}}$ and the transcript ${\mathcal {T}}$, we define two adversaries ${\mathcal {A}}^{\prime }$ and ${\mathcal {A}}^{\prime \prime }$ and correspondingly two transcripts ${\mathcal {T}}^{\prime }$ and ${\mathcal {T}}^{\prime \prime }$ in the following manner.

Adversary ${\mathcal {A}}^{\prime }$ is the same as ${\mathcal {A}}$ up to and including the first verification query for tag length $\lambda _0$; the transcript ${\mathcal {T}}^{\prime }$ is obtained from ${\mathcal {T}}$ by dropping from ${\mathcal {T}}$ all queries after the first verification query for tag length $\lambda _0$. So, $\Pr [{\textsf {succ}} ({\mathcal {A}}^{\prime }({\mathcal {T}}^{\prime }),\lambda _0)]=\Pr [{\mathcal {E}}]$.
Adversary ${\mathcal {A}}^{\prime \prime }$ is the same as ${\mathcal {A}}$ except for the first verification query for tag length $\lambda _0$. ${\mathcal {A}}^{\prime \prime }$ does not issue the first verification query for tag length $\lambda _0$. The transcript ${\mathcal {T}}^{\prime \prime }$ is the same as that of ${\mathcal {T}}$ except that in ${\mathcal {T}}^{\prime \prime }$, the answer to the first verification query for tag length $\lambda _0$ is set to be ${\textsf {false}} $^{Footnote 1}. The event $\overline{{\mathcal {E}}} \wedge {\mathcal {S}}$ captures the following situation for ${\mathcal {A}}$ on the transcript ${\mathcal {T}}$: the response to the first verification query for tag length $\lambda _0$ is ${\textsf {false}} $ and ${\mathcal {A}}$ is successful on some later verification query for tag length $\lambda _0$. Note that this situation is exactly the event that ${\mathcal {A}}^{\prime \prime }$ is successful for tag length $\lambda _0$ on transcript ${\mathcal {T}}^{\prime \prime }$. So, $\Pr [{\textsf {succ}} ({\mathcal {A}}^{\prime \prime }({\mathcal {T}}^{\prime \prime }),\lambda _0)]=\Pr [\overline{{\mathcal {E}}} \wedge {\mathcal {S}}]$.

Note that ${\mathcal {A}}^{\prime \prime }$ makes $q_{v,\lambda _0}-1$ verification queries for tag length $\lambda _0$. So, the problem of proving the result for $q_{v,\lambda _0}$ verification queries has been reduced to the problem of proving the result for $q_{v,\lambda _0}-1$ verification queries. Proceeding by induction, to prove the bound given in (20), it is sufficient to consider an adversary which makes exactly one verification query for tag length $\lambda _0$. Let the single verification query for tag length $\lambda _0$ be $(N,x,{\textsf {tag}} ,\lambda _0)$.

The second reduction is to ignore all queries in ${\mathcal {T}}$ after the verification query for tag length $\lambda _0$. Such queries have no effect on the success probability of the verification query for tag length $\lambda _0$.

The third reduction is the following. If the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to less than q distinct (nonce, tag-length) values, then insert additional tag generation queries to the transcript with (nonce, tag-length) values not equal to $(N,\lambda _0)$ such that the queries in the augmented ${\mathcal {S}}_{\lambda _0}$ give rise to exactly q distinct (nonce, tag-length) values. Such augmentation of the transcript does not decrease the adversary’s advantage.

In view of the above reductions, it is sufficient to consider an adversary ${\mathcal {A}}$ with a transcript ${\mathcal {T}}$ where the last query is the verification query $(N,x,{\textsf {tag}} ,\lambda _0)$ for tag length $\lambda _0$ and the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to exactly q distinct (nonce, tag-length) values. The transcript ${\mathcal {T}}$ can contain any number of tag generation queries for the tag length $\lambda _0$. However, by the restriction that among the tag generation queries, the (nonce, tag-length) pair cannot repeat, ${\mathcal {T}}$ can contain at most one tag generation query of the form $(N,x^{\prime },\lambda _0)$. For $\lambda \ne \lambda _0$, the transcript ${\mathcal {T}}$ can contain multiple verification queries with the same value for the (nonce, $\lambda $) pair. So, the total number of queries in ${\mathcal {S}}_{\lambda _0}$ can be greater than q.

Let ${\mathfrak {N}}={\textsf {bin}} _8(\lambda _0-1)||N$, $Q=f({\mathfrak {N}})$ and $\tau _0=\tau _{\lambda _0}$. Let the q distinct values of (nonce, tag-length) pairs arising from the queries in ${\mathcal {S}}_{\lambda _0}$ be $(N^{(1)},\lambda ^{(1)}),\ldots ,(N^{(q)},$ $\lambda ^{(q)})$. For $i=1,\ldots ,q$, let ${\mathfrak {N}}^{(i)}={\textsf {bin}} _8(\lambda ^{(i)}-1)||N^{(i)}$ and $Q^{(i)}=f({\mathfrak {N}}^{(i)})$. Define ${\mathbf {Q}}=(Q^{(1)},\ldots ,Q^{(q)})$. Let $q^{\prime }$ be the number of distinct tag-length values arising from the queries in ${\mathcal {S}}_{\lambda _0}$ and let $\lambda ^{(1)},\ldots ,\lambda ^{(q^{\prime })}$ be these tag lengths. For $i=1,\ldots ,q^{\prime }$, define $\tau _i=\tau _{\lambda ^{(i)}}$ and $\varvec{\tau }=(\tau _1,\ldots ,\tau _{q^{\prime }})$. The entire randomness in the transcript arises from ${\mathbf {Q}}$ and $\varvec{\tau }$.

Consider the final verification query $(N,x,{\textsf {tag}} ,\lambda _0)$ and let $\ell ={\textsf {len}} (x)$. Let $\ell ^{(\star )}={\textsf {len}} (x^{(\star )})$ if there is a prior tag generation query $(N^{(\star )},x^{(\star )},\lambda ^{(\star )})$ (with response ${\textsf {tag}} ^{(\star )}$) such that $N^{(\star )}=N$ and $\lambda ^{(\star )}=\lambda _0$; otherwise, $\ell ^{(\star )}$ is undefined. Let $\gamma =2^n\delta _q\varepsilon (\ell ,\ell ^{(\star )})$ if $\ell ^{(\star )}$ is defined, otherwise, $\gamma =\delta _{q+1}$. To prove the theorem, it is sufficient to show

$$\begin{aligned} \Pr [{\textsf {succ}} ({\mathcal {A}}({\mathcal {T}}),\lambda _0)]\le & {} \gamma /2^{\lambda _0}. \end{aligned}$$

(21)

The verification query is successful if ${\textsf {tag}} ={\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x))$. So,

$$\begin{aligned} \Pr [{\textsf {succ}} ({\mathcal {A}}({\mathcal {T}}),\lambda _0)]= & {} \Pr [{\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x))={\textsf {tag}} ]. \end{aligned}$$

(22)

We consider the probability on the right hand side of (22) under two cases.

The first case is when there is no tag generation query having (nonce, tag-length) pair to be equal to $(N,\lambda _0)$ in ${\mathcal {T}}$. In this case, ${\mathfrak {N}}^{(1)},\ldots ,{\mathfrak {N}}^{(q)},{\mathfrak {N}}$ are distinct values to which f is applied. Since the adversary is adaptive, the x and ${\textsf {tag}} $ in the final verification query are functions of the earlier responses it received and in turn are functions of ${\mathbf {Q}}$ and $\varvec{\tau }$. We write $x\equiv x({\mathbf {Q}},\varvec{\tau })$ and ${\textsf {tag}} \equiv {\textsf {tag}} ({\mathbf {Q}},\varvec{\tau })$ to denote this functional dependence. We would like to emphasise that the adversary does not have access to ${\mathbf {Q}}$ and $\varvec{\tau }$ and writing x and ${\textsf {tag}} $ as functions of ${\mathbf {Q}}$ and $\varvec{\tau }$ is only to help in the argument. Let a and ${\mathbf {a}}$ be arbitrary values of $\tau _0$ and $\varvec{\tau }$. Let $b_1,\ldots ,b_q$ be arbitrary n-bit strings and let ${\mathbf {b}}=(b_{1},\ldots ,b_{q})$. So,

$$\begin{aligned}&\Pr [{\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau })))={\textsf {tag}} ({\mathbf {Q}},\varvec{\tau })] \nonumber \\&\quad = \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau })))] \nonumber \\&\quad = \sum _{{\mathbf {a}},a} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau }))) \wedge (\varvec{\tau }={\mathbf {a}})\wedge (\tau _0=a)] \nonumber \\&\quad = \sum _{{\mathbf {a}},a} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {Q}},{\mathbf {a}}))) \wedge (\varvec{\tau }={\mathbf {a}})\wedge (\tau _0=a)] \nonumber \\&\quad = \sum _{{\mathbf {a}},a} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {Q}},{\mathbf {a}})))] \Pr [(\varvec{\tau }={\mathbf {a}})\wedge (\tau _0=a)]. \nonumber \\ \end{aligned}$$

(23)

Let c be an arbitrary $(n-\lambda _0)$-bit binary string. We consider

$$\begin{aligned}&\Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {Q}},{\mathbf {a}})))] \nonumber \\&\quad = \sum _{{\mathbf {b}}} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {Q}},{\mathbf {a}}))) \wedge ({\mathbf {Q}}={\mathbf {b}})] \nonumber \\&\quad = \sum _{{\mathbf {b}}} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {b}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {b}},{\mathbf {a}}))) \wedge ({\mathbf {Q}}={\mathbf {b}})] \nonumber \\&\quad = \sum _{{\mathbf {b}}} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = b \wedge ({\mathbf {Q}}={\mathbf {b}})] \nonumber \\&\quad \quad \quad \quad \quad \quad \quad \left( \text{ where } b = {\textsf {tag}} ({\mathbf {b}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {b}},{\mathbf {a}})))\right) \nonumber \\&\quad = \sum _{{\mathbf {b}}} \Pr [{\textsf {msb}} _{\lambda _0}(f({\mathfrak {N}}))=b,f({\mathfrak {N}}^{(1)})=b_{1},\ldots ,f({\mathfrak {N}}^{(q)})=b_{q}] \nonumber \\&\quad = \sum _{{\mathbf {b}}} \sum _{c} \Pr [f({\mathfrak {N}})=b||c,f({\mathfrak {N}}^{(1)})=b_{1},\ldots ,f({\mathfrak {N}}^{(q)})=b_{q}] \nonumber \\&\quad \le \sum _{{\mathbf {b}}} 2^{n-\lambda _0}\delta _{q+1}/(2^n)^{q+1} \nonumber \\&\quad = 2^{n-\lambda _0}\delta _{q+1}/2^n = \gamma /2^{\lambda _0}. \end{aligned}$$

(24)

Combining (23) and (24), we have

$$\begin{aligned}&\Pr [{\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau })))={\textsf {tag}} ({\mathbf {Q}},\varvec{\tau })] \nonumber \\&\quad = \sum _{{\mathbf {a}},a} \Pr [{\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x({\mathbf {Q}},{\mathbf {a}})))] \Pr [(\varvec{\tau }={\mathbf {a}})\wedge (\tau _0=a)] \nonumber \\&\quad \le \gamma /2^{\lambda _0} \sum _{{\mathbf {a}},a} \Pr [(\varvec{\tau }={\mathbf {a}})\wedge (\tau _0=a)] = \gamma /2^{\lambda _0}. \end{aligned}$$

(25)

This proves the first case.

In the second case, let the transcript ${\mathcal {T}}$ be such that there is a tag generation query $(N^{(\star )},x^{(\star )},\lambda ^{(\star )})$ (with response ${\textsf {tag}} ^{(\star )}$) where $N^{(\star )}=N$ and $\lambda ^{(\star )}=\lambda _0$. Note that by the query restriction on the adversary, $x^{(\star )}\ne x$. Let ${\mathfrak {N}}^{(\star )}={\textsf {bin}} _8(\lambda ^{(\star )}-1)||N^{(\star )}$, $Q^{(\star )}=f({\mathfrak {N}}^{(\star )})$ and $\tau _{\star }=\tau _{\lambda ^{(\star )}}$. Then $Q^{(\star )}=Q$ and $\tau _{\star }=\tau _0$. Let ${\mathbf {Q}}$ be the vector consisting of $Q^{(1)},\ldots ,Q^{(q)}$ but, not containing $Q^{(\star )}$ and let $\varvec{\tau }$ be the vector consisting of $\tau _1,\ldots ,\tau _{q^{\prime }}$ but, not containing $\tau _{\star }$. So, ${\mathbf {Q}}$ is a vector having $q-1$ components and $\varvec{\tau }$ is a vector having $q^{\prime }-1$ components. In this case, $x\equiv x({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})$ and ${\textsf {tag}} \equiv {\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})$. As in the earlier argument, we highlight that the adversary does not have access to ${\mathbf {Q}}$ and $\varvec{\tau }$ and writing x and ${\textsf {tag}} $ as functions of ${\mathbf {Q}}$ and $\varvec{\tau }$ (and also ${\textsf {tag}} ^{(\star )}$) is to help in the argument. Due to the adaptive nature of the adversary, $x^{(\star )}$ is also a function of portions of ${\mathbf {Q}}$ and $\varvec{\tau }$ which corresponds to the queries earlier to $(N^{(\star )},x^{(\star )},\lambda ^{(\star )})$. Hence, we write $x^{(\star )}\equiv x^{(\star )}({\mathbf {Q}},\varvec{\tau })$. Note that $\tau _0$ is independent of $\varvec{\tau }$.

Let ${\mathbf {a}}$ and ${\textsf {t}} $ be arbitrary values for $\varvec{\tau }$ and ${\textsf {tag}} ^{(\star )}$ respectively. Then

$$\begin{aligned}&\Pr [{\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})))={\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})] \nonumber \\&\quad = \sum _{{\mathbf {a}}, {\textsf {t}} } \Pr [({\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})))={\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})) \wedge (\varvec{\tau }={\mathbf {a}}) \nonumber \\&\qquad \wedge ({\textsf {tag}} ^{(\star )}={\textsf {t}} )] \nonumber \\&\quad = \sum _{{\mathbf {a}}, {\textsf {t}} } \Pr [({\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} )))={\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} )) \nonumber \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}({\mathbf {Q}},{\mathbf {a}})))=t) \wedge (\varvec{\tau }={\mathbf {a}})] \nonumber \\&\quad = \sum _{{\mathbf {a}}} \Bigl ( \sum _{t}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}({\mathbf {Q}},{\mathbf {a}})))={\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} )\oplus {\textsf {t}} ) \nonumber \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))))] \Bigr ) \times \Pr [\varvec{\tau }={\mathbf {a}}]. \end{aligned}$$

(26)

Let ${\mathbf {b}}$ and a be an arbitrary value of ${\mathbf {Q}}$ and $\tau _0$. Let $c_1$ and $c_2$ be arbitrary $(n-\lambda _0)$-bit strings. We consider

$$\begin{aligned}&\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}({\mathbf {Q}},{\mathbf {a}})))={\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} )\oplus {\textsf {t}} ) \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))))] \\&\quad =\sum _{{\mathbf {b}}} \Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}({\mathbf {Q}},{\mathbf {a}})))={\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} )\oplus {\textsf {t}} ) \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},{\mathbf {a}}, {\textsf {t}} ))))\wedge ({\mathbf {Q}}={\mathbf {b}})] \\&\quad =\sum _{{\mathbf {b}}} \Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {b}},{\mathbf {a}}, {\textsf {t}} ))\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}({\mathbf {b}},{\mathbf {a}})))={\textsf {tag}} ({\mathbf {b}},{\mathbf {a}}, {\textsf {t}} )\oplus {\textsf {t}} ) \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} ({\mathbf {b}},{\mathbf {a}}, {\textsf {t}} ) \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x({\mathbf {b}},{\mathbf {a}}, {\textsf {t}} ))))\wedge ({\mathbf {Q}}={\mathbf {b}})] \end{aligned}$$

To simplify notation, we write $x({\mathbf {b}},{\mathbf {a}},{\textsf {t}} )$ as x, $x^{\star }({\mathbf {b}},{\mathbf {a}})$ as $x^{\star }$ and ${\textsf {tag}} ({\mathbf {b}},{\mathbf {a}},{\textsf {t}} )$ as ${\textsf {tag}} $. So, we have

$$\begin{aligned}&\sum _{{\mathbf {b}}}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x)\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \\&\qquad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x))) \wedge ({\mathbf {Q}} = {\mathbf {b}})] \\&\quad =\sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \\&\quad \quad \quad \quad \quad \quad \wedge ({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x))) \wedge ({\mathbf {Q}} = {\mathbf {b}})\wedge (\tau _0 = a)] \\&\quad =\sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \wedge (\tau _0 = a)] \\&\qquad \times \Pr [({\textsf {msb}} _{\lambda _0}(Q) = {\textsf {tag}} \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x))) \wedge ({\mathbf {Q}} = {\mathbf {b}})] \\&\quad =\sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \wedge (\tau _0 = a)] \\&\qquad \times \left( \sum _{c_1} \Pr [(Q = ({\textsf {tag}} \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)))||c_1) \wedge ({\mathbf {Q}} = {\mathbf {b}})]\right) \end{aligned}$$

Let $b=({\textsf {tag}} \oplus {\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)))||c_1)$. Then $\Pr [(Q=b) \wedge ({\mathbf {Q}} = {\mathbf {b}})]$ is bounded from above by the q-interpolation probability of f. So, we have

$$\begin{aligned}&\sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \wedge (\tau _0 = a)] \nonumber \\&\qquad \times \left( \sum _{c_1} \Pr [(Q = b) \wedge ({\mathbf {Q}} = {\mathbf {b}})]\right) \nonumber \\&\quad \le \sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \wedge (\tau _0 = a)] \times 2^{n-\lambda _0}\frac{\delta _q}{(2^n)^q} \nonumber \\&\quad =2^{n-\lambda _0}\frac{\delta _q}{(2^n)^q} \times \sum _{{\mathbf {b}},a}\Pr [({\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{a}(x)\oplus {\textsf {Hash}} _{a}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ) \wedge (\tau _0 = a)] \nonumber \\&\quad =2^{n-\lambda _0}\delta _q/(2^n)^q \times \sum _{{\mathbf {b}}}\Pr [{\textsf {msb}} _{\lambda _0}({\textsf {Hash}} _{\tau _0}(x)\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )}))={\textsf {tag}} \oplus {\textsf {t}} ] \nonumber \\&\quad =2^{n-\lambda _0}\delta _q/(2^n)^q \times \sum _{{\mathbf {b}}}\sum _{c_2}\Pr [{\textsf {Hash}} _{\tau _0}(x)\oplus {\textsf {Hash}} _{\tau _0}(x^{(\star )})=({\textsf {tag}} \oplus {\textsf {t}} )||c_2] \nonumber \\&\quad \le 2^{n-\lambda _0}\delta _q/(2^n)^q \times \sum _{{\mathbf {b}}} 2^{n-\lambda _0}\varepsilon (\ell ,\ell ^{(\star )}) \nonumber \\&\quad =2^{n-\lambda _0}\delta _q/(2^n)^q \times (2^n)^{q-1}\times 2^{n-\lambda _0}\varepsilon (\ell ,\ell ^{(\star )}) \nonumber \\&\quad =2^{n-2\lambda _0}\delta _q\varepsilon (\ell ,\ell ^{(\star )}). \end{aligned}$$

(27)

Combining (26) and (27), we have,

$$\begin{aligned}&\Pr [{\textsf {msb}} _{\lambda _0}(Q\oplus {\textsf {Hash}} _{\tau _0}(x({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})))={\textsf {tag}} ({\mathbf {Q}},\varvec{\tau }, {\textsf {tag}} ^{(\star )})] \nonumber \\&\quad \le \sum _{{\mathbf {a}}} \Bigl ( \sum _{t}2^{n-2\lambda _0}\delta _q\varepsilon (\ell ,\ell ^{(\star )}) \Bigr ) \times \Pr [\varvec{\tau }={\mathbf {a}}]\nonumber \\&\quad = \sum _{t}2^{n-2\lambda _0}\delta _q\varepsilon (\ell ,\ell ^{(\star )}) \times \sum _{{\mathbf {a}}}\Pr [\varvec{\tau }={\mathbf {a}}]\nonumber \\&\quad = 2^{\lambda _0}2^{n-2\lambda _0}\delta _q\varepsilon (\ell ,\ell ^{(\star )}) \nonumber \\&\quad = 2^{n-\lambda _0}\varepsilon (\ell ,\ell ^{(\star )})\delta _q = \gamma /2^{\lambda _0}. \end{aligned}$$

(28)

This proves the second case.

$\square $

Tightness of the security bound: The scheme ${\textsf {nvMAC}} $ is obtained as a variant of the Wegman-Carter scheme. The statement and proof of Theorem 1 follows the bound on the Wegman-Carter scheme established by Bernstein [5]. As mentioned earlier, Bernstein’s bound has been proved to be tight [17, 20]. A natural question is to consider whether the bound of Theorem 1 is also tight. We have considered this question for ${\textsf {nvMAC}} $. It does not seem possible to use the proof approach used in [17, 20] to show the tightness of the bound in Theorem 1. In fact, the approach does not also seem to work for the generic scheme ${\textsf {nvMAC}} $-${\textsf {Generic}} $.

The security bound of Theorem 1in terms of query complexity: The statement of Theorem 1 and the security bound provided in it are in terms of query profile. If it is to be translated to terms of query complexity, the following point is to be noted. The hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ may be such that, the differential probability of the hash function may depend on the lengths of the particular queries. For example, if $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ is a polynomial hash, the degree of the polynomial formed from the messages and hence the corresponding differential probability is a function of the lengths of the messages. The details of this variation in the query lengths are lost when we move from the notion of query profile to the notion of query complexity. As a result, the variability in the differential probability also cannot be captured when the security is considered in terms of query complexity. In this case, a uniformity is required in the probability and to attain that, the maximum of all the differential probabilities is considered. As a result, the security bound obtained in terms of query complexity is not precise and depending on the particular queries made by the adversary, it may be an over-estimation by a large margin. Hence, in the detailed security analysis we consider the notion of query profile and the security in terms of query complexity has been mentioned in respective corollaries.

The statement of Theorem 1 and the security bound provided in it look as follows in terms of query complexity.

Corollary 1

In the scheme ${\textsf {nvMAC}} $ defined in Table 2, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ be such that for any distinct $x, x^{\prime } \in {\mathcal {M}}$ and any $y \in \{0,1\}^n$, $\Pr [{\textsf {Hash}} _{\tau }(x) \oplus {\textsf {Hash}} _{\tau }(x^{\prime }) = y]$ is at most $\varepsilon \ge 1/2^n$, i.e. $\varepsilon $ is the maximum of the differential probabilities for all combination of messages.

For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $. Let the total number of bits in the tag generation queries be $\sigma _g$ and that in the verification queries be $\sigma _v$. Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Further, suppose $\delta _q$ be such that the q-interpolation probability of f is at most $\delta _q/(2^n)^{q}$ and $(q+1)$-interpolation probability of f is at most $(\delta _q\varepsilon )/(2^n)^{q}$. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t,\sigma _g,\sigma _v) \le 2^{n-\lambda _0} q_{v,\lambda _0}\delta _q\varepsilon . \end{aligned}$$

(29)

Essentially, in this case the bound is similar to the bound given in the security proof of the Wegman-Carter nonce-based MAC scheme given in [5]. If in some case the actual queries are such that the corresponding differential probabilities are much less than the maximum value, then this bound becomes much higher than the actual advantage of the adversary, i.e. the bound becomes more loose. Let us consider a numerical example to illustrate this scenario.

In this example, we will consider Horner’s rule based hash function and the underlying field to be ${\mathbb {F}}_{2^n}$. The differential probability of the Horner’s rule based hash for two distinct messages of length $\ell $ and $\ell ^{\prime }$, where $\ell \ge \ell ^{\prime }$, is given by $\varepsilon (\ell , \ell ^{\prime }) = \ell /2^n$. For ease of understanding, in this example let us consider $\delta _q =1$, which is true for a uniform random function. Let $n=128$, $\lambda _0 = 96$, $q_{v, \lambda _0} = 1$. Let us consider an (rather artificial) upper limit of $2^{20}$ n-bit blocks on the length of the message the adversary can query on. We consider some scenarios and the corresponding query profile based advantages.

Scenario 1: For tag length $\lambda _0$, let the adversary make 1 tag generation query and 1 verification query, each on a message containing 512 blocks. The differential probability reflected in the bound (20) is $\varepsilon (512,512) = 2^9/2^{128}$ and the corresponding bound becomes $2^{-87}$.
Scenario 2: For tag length $\lambda _0$, let the adversary make 1023 tag generation queries and 1 verification query, each on a message containing 1 block. Let one of the tag generation queries have the same nonce as the verification query. Then, the differential probability reflected in the bound (20) is $\varepsilon (1,1) = 1/2^{128}$ and the corresponding bound becomes $2^{-96}$.
Scenario 3: For tag length $\lambda _0$, let the adversary make one tag generation query and one verification query on messages having $2^{20}$ blocks and the same nonce. Then, the differential probability reflected in the bound (20) is $\varepsilon (2^{20},2^{20}) = 2^{20}/2^{128}$ and the corresponding bound becomes $2^{-76}$.

Let us now consider the query complexity based advantage for the above scenarios. Looking at the bound in (29), we have no clue about which value of the differential probability to be used here. The reason is, in this case, we only have the information regarding the total query complexity, but we do not know the length of each message. As a result, we are forced to use the maximum value of the differential probability which is obtained for $2^{20}$-block messages resulting in the differential probability to be $2^{20}/2^{128}$. The corresponding bound given by (29) in all three scenarios becomes $2^{-76}$. So, we see that even though the query complexities in Scenarios 1 and 2 is 1024 blocks and the query complexity in Scenario 3 is $2^{21}$ blocks, the query complexity based advantage in all three cases are the same. This illustrates that compared to the query complexity based advantage, the query profile based advantage provides a more granular information about the advantage.

It is to be noted that, the bound given by Bernstein [5] in the security proof of the Wegman-Carter nonce-based MAC scheme is $q_{v, \lambda _0}\delta _q\varepsilon $. This bound also lacks the information of particular message lengths. Hence, the difficulty stated above in case of complexity based advantage is applicable for this bound as well.

We have highlighted the differences between query profile based and query complexity based advantages. Also, we have provided bounds for both kinds of advantages. Depending on the requirement, one may use the appropriate kind of advantage and the corresponding bound.

4.1 Reducing key size

In a practical instantiation of ${\textsf {nvMAC}} $, the random function f will be instantiated by a keyed function ${\textsf {F}} _K$. The key for the entire scheme will consist of the key K along with the $\#{\mathcal {L}}$ keys $(\tau _{\lambda })_{\lambda \in {\mathcal {L}}}$ for the hash function ${\textsf {Hash}} $. Depending on the size of ${\mathcal {L}}$, for certain applications, the size of the key may be too large. Our next constructions show how to obtain nvMAC schemes with short keys.

The hash family $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$, the nonce space ${\mathcal {N}}$, the message space ${\mathcal {M}}$, the set of allowed tag lengths ${\mathcal {L}}$ and the tag space remain the same as in the case of ${\textsf {nvMAC}} $.

Our goal is to derive the key for the hash function by applying a PRF to the concatenation of the tag length and the nonce. Depending upon the actual choice of the hash function, the key could either be an n-bit string (or, a string of some fixed length which is at least n), or, it could be a variable length string which depends upon the length of the message. Typical examples of hash function where the key is a fixed length string is the polynomial hash or the BRW hash [4, 6, 19] while typical examples of hash function where the key depends upon the length of the message is either the multi-linear hash [14], or the pseudo-dot product [30], or the UMAC [8] construction.

Table 3 A secure and efficient nvMAC scheme using a stream cipher supporting an initialisation vector

Full size table

We consider the key of the hash function to be a sequence of n-bit blocks with the last block possibly being a partial block. Given the hash function ${\textsf {Hash}} $ and a message x, let ${\mathfrak {b}}(x)$ denote the number of n-bit blocks of key material required by ${\textsf {Hash}} $ to process the message x. As mentioned above, depending upon the choice of ${\textsf {Hash}} $, ${\mathfrak {b}}(x)$ could be independent of x (i.e., ${\textsf {Hash}} $ uses fixed length keys), or, it could depend upon x (i.e., ${\textsf {Hash}} $ uses a key which depends upon the length of x).

We start by constructing a nonce-based MAC scheme from a stream cipher supporting an initialisation vector. The assumption on such a stream cipher is that it is a PRF [2]. Formally, we use the PRF $\{{\textsf {SC}} _K\}_{K\in {\mathcal {K}}}$, where ${\textsf {SC}} _K$ is a stream cipher which maps an n-bit string under the key K to an output keystream. We will assume that the output keystream is of some fixed length which is sufficiently big for all practical applications. An appropriate length prefix of the output keystream is used in a particular context. We denote the nvMAC scheme built from ${\textsf {SC}} $ as ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $. The tag generation algorithm for the ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $ scheme is shown in Table 3. The verification algorithm ${\textsf {SC}} \text{- }{\textsf {nvMAC}} .{\textsf {Verify}} _K(N,x,{\textsf {tag}} ,\lambda )$ works as follows: compute ${\textsf {tag}} ^{\prime }={\textsf {SC}} \text{- }{\textsf {nvMAC}} .{\textsf {Gen}} _{K}(N,x,\lambda )$; return ${\textsf {true}} $ if ${\textsf {tag}} ={\textsf {tag}} ^{\prime }$, else return ${\textsf {false}} $.

The key space for ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $ is ${\mathcal {K}}$. The key generation algorithm consists of sampling K uniformly at random from ${\mathcal {K}}$.

The security of ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $ is given by the following result.

Theorem 2

In ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $ defined in Table 3, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ is $\varepsilon $-AXU, where $\varepsilon (\ell ,\ell ^{\prime }) \ge 1/2^n$ for all $\ell ,\ell ^{\prime }\le L$.

Fix a query profile ${\mathfrak {C}}$. For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $ which are in ${\mathfrak {C}}$. Let $q_g=\sum _{\lambda \in {\mathcal {L}}}q_{g,\lambda }$ and $q_v=\sum _{\lambda \in {\mathcal {L}}}q_{v,\lambda }$. Let $\sigma _g$ (resp. $\sigma _v$) be the total number of bits in all the tag generation (resp. verification) queries in ${\mathfrak {C}}$.

Let $\lambda $ be such that $q_{v,\lambda }\ge 1$ and for $1\le i\le q_{v,\lambda }$, let $Q_{v,\lambda }^{(i)}=(N_{v,\lambda }^{(i)},x_{v,\lambda }^{(i)},{\textsf {tag}} _{v,\lambda }^{(i)},\lambda )$ be the i-th verification query with tag length $\lambda $. Let $\ell _{v,\lambda }^{(i)}={\textsf {len}} (x_{v,\lambda }^{(i)})$. Corresponding to $Q_{v,\lambda }^{(i)}$, there is at most one tag generation query $Q_{g,\lambda }^{(i^{\star })}=(N_{g,\lambda }^{(i^{\star })},x_{g,\lambda }^{(i^{\star })},\lambda )$ such that $N_{v,\lambda }^{(i)}=N_{g,\lambda }^{(i^{\star })}$. Let $\ell _{g,\lambda }^{(i^{\star })}={\textsf {len}} (x_{g,\lambda }^{(i^{\star })})$ if there is such a $Q_{g,\lambda }^{(i^{\star })}$, otherwise $\ell _{g,\lambda }^{(i^{\star })}$ is undefined.

Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {SC}} \text{- }{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t,{\mathfrak {C}})\le & {} {\textsf {Adv}} _{{\textsf {SC}} }^\mathrm{\small prf}(t+t^{\prime },q_g+q_v,n(q_g+q_v)) + \frac{1}{2^{\lambda _0}} \times \sum _{1\le i \le q_{v,\lambda _0}}\gamma _i\qquad \end{aligned}$$

(30)

where $\gamma _i=2^n\varepsilon (\ell _{v,\lambda _0}^{(i)},\ell _{g,\lambda _0}^{(i^{\star })})$ if there is a $Q_{g,\lambda _0}^{(i^{\star })}$ corresponding to $Q_{v,\lambda _0}^{(i)}$ with $N_{v,\lambda _0}^{(i)}=N_{g,\lambda _0}^{(i^{\star })}$; otherwise $\gamma _i=1$. Here $t^{\prime }$ is the time required to hash $q_v+q_g$ messages of total length at most $\sigma _g+\sigma _v$, plus some bookkeeping time.

Proof

The proof is similar to the proof of Theorem 1. We mention the differences.

The first reduction is to replace ${\textsf {SC}} _K$ by a uniform random function $\rho $ from $\{0,1\}^n$ to $\{0,1\}^L$. The advantage of the adversary in detecting this change is captured by the term ${\textsf {Adv}} _{{\textsf {SC}} }^\mathrm{\small prf}(t+t^{\prime },q_g+q_v,n(q_g+q_v))$ in (30). Let the scheme resulting from the replacement be denoted as $\rho \text{- }{\textsf {nvMAC}} $.

Since ${\textsf {SC}} _K$ has been taken care of, the ensuing analysis is information theoretic. Let ${\mathcal {A}}$ be a deterministic and computationally unbounded adversary attacking $\rho \text{- }{\textsf {nvMAC}} $ and having query profile ${\mathfrak {C}}$. It is required to upper bound ${\textsf {Adv}} _{\rho \text{- }{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0]({\mathcal {A}})$.

As in the proof of Theorem 1, the task reduces to analysing the probability of the event ${\textsf {succ}} ({\mathcal {A}}({\mathcal {T}}),\lambda _0)$ for a transcript ${\mathcal {T}}$ whose query profile is ${\mathfrak {C}}$.

The second reduction is to assume that $q_{v,\lambda _0} = 1$; the third reduction is to assume that all queries after the single verification query for tag length $\lambda _0$ are discarded. These reductions are also used in the proof of Theorem 1 and the justifications for these reductions in the present context are the same as those described in the proof of Theorem 1. As in Theorem 1, consider the set ${\mathcal {S}}_{\lambda _0}$ which consists of all queries made by ${\mathcal {A}}$ other than the verification queries for $\lambda _0$. Further, similar to the proof of Theorem 1, insert queries to the transcript ${\mathcal {T}}$, to ensure that the number of distinct (nonce, tag-length) pairs arising from the queries in ${\mathcal {S}}_{\lambda _0}$ is q.

In view of the above reductions, it is sufficient to consider an adversary ${\mathcal {A}}$ with a transcript ${\mathcal {T}}$ where the last query is the verification query $(N,x,{\textsf {tag}} ,\lambda _0)$ for tag length $\lambda _0$. Also, let $(N^{(1)},\lambda ^{(1)}),\ldots ,(N^{(q)},\lambda ^{(q)})$ be the distinct (nonce, tag-length) pairs arising from the queries in ${\mathcal {S}}_{\lambda _0}$. For $1\le i\le q$, define ${\mathfrak {N}}^{(i)}={\textsf {bin}} _8(\lambda ^{(i)}-1)||N^{(i)}$, $(Q^{(i)},\tau _i)=\rho ({\mathfrak {N}}^{(i)})$ (considering the full length output of $\rho $), ${\mathbf {Q}}=(Q^{(1)},\ldots ,Q^{(q)})$ and $\varvec{\tau }=(\tau _1,\ldots ,\tau _q)$. The entire randomness in the transcript arises from ${\mathbf {Q}}$ and $\varvec{\tau }$.

At this point, we would like to mention a small difference with the proof of Theorem 1. In the scheme ${\textsf {nvMAC}} $, the hash key depends upon the tag length, whereas in ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $, the hash key is determined by (nonce, tag-length) pair. As a consequence, the vector $\varvec{\tau }$ defined above has q components, while the vector $\varvec{\tau }$ defined in the proof of Theorem 1 has $q^{\prime }$ components, where $q^{\prime }$ is the number of distinct tag lengths arising from the queries in ${\mathcal {S}}_{\lambda _0}$.

Modulo this small difference, the rest of the proof is the same as the proof of Theorem 1. In particular, the proof divides into two cases. The first case is where the adversary does not make any previous tag generation query with (nonce, tag-length) pair equal to $(N,\lambda _0)$ and the second case is where the adversary does make such a query. The probability calculations for these two cases are almost the same as those in the proof of Theorem 1. The only difference is that in the present case, $\rho $ is uniform random function and so $\delta _j=1$. Using these values of $\delta _j$, the calculations done in the two cases of the proof of Theorem 1 show the bound stated in (30). $\square $

The following corollary provides the translation of Theorem 2 in terms of query complexity.

Corollary 2

In ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $ defined in Table 3, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ be such that for any distinct $x, x^{\prime } \in {\mathcal {M}}$ and any $y \in \{0,1\}^n$, $\Pr [{\textsf {Hash}} _{\tau }(x) \oplus {\textsf {Hash}} _{\tau }(x^{\prime }) = y]$ is at most $\varepsilon \ge 1/2^n$, i.e. $\varepsilon $ is the maximum of the differential probabilities for all combination of messages.

For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $. Let $q_g=\sum _{\lambda \in {\mathcal {L}}}q_{g,\lambda }$ and $q_v=\sum _{\lambda \in {\mathcal {L}}}q_{v,\lambda }$. Let $\sigma _g$ (resp. $\sigma _v$) be the total number of bits in all the tag generation (resp. verification) queries.

Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {SC}} \text{- }{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t, \sigma _g, \sigma _v)\le & {} {\textsf {Adv}} _{{\textsf {SC}} }^\mathrm{\small prf}(t+t^{\prime },q_g+q_v,n(q_g+q_v)) + 2^{n-\lambda _0} q_{v,\lambda _0}\varepsilon .\qquad \end{aligned}$$

(31)

Here $t^{\prime }$ is the time required to hash $q_v+q_g$ messages of total length at most $\sigma _g+\sigma _v$, plus some bookkeeping time.

In the scheme ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $, the pair $(Q,\tau )$ is derived by applying the stream cipher to ${\textsf {bin}} _8(\lambda -1)||N$. Since a stream cipher produces a long enough keystream, a single application of ${\textsf {SC}} $ is sufficient to obtain the pair $(Q,\tau )$. Suppose that we wish to use a PRF ${\textsf {F}} $ whose output is an n-bit string (or, a short fixed length string). Clearly, then a single invocation of ${\textsf {F}} $ will not be sufficient to obtain the pair $(Q,\tau )$. The PRF ${\textsf {F}} $ will have to be invoked repeatedly to obtain an output bit string of desired length from which the pair $(Q,\tau )$ can be obtained.

Formally, we use a PRF family $\{{\textsf {F}} _{{\textsf {K}} }\}_{K\in {\mathcal {K}}}$, where for each $K\in {\mathcal {K}}$, ${\textsf {F}} _K:\{0,1\}^n\rightarrow \{0,1\}^n$. Similar to the case of ${\textsf {SC}} \text{- }{\textsf {nvMAC}} $, the hash family $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$, the nonce space ${\mathcal {N}}$, the message space ${\mathcal {M}}$, the set of allowed tag lengths ${\mathcal {L}}$ and the tag space remain the same as in the case of ${\textsf {nvMAC}} $. The key space for the scheme is ${\mathcal {K}}$. The key generation algorithm consists of sampling K uniformly at random from ${\mathcal {K}}$.

The tag generation algorithm of an nvMAC scheme built from the PRF ${\textsf {F}} $ is shown in Table 4 and is denoted as ${\textsf {F}} \text{- }{\textsf {nvMAC}} .{\textsf {Gen}} $. The verification algorithm ${\textsf {F}} \text{- }{\textsf {nvMAC}} .{\textsf {Verify}} (N,x,{\textsf {tag}} ,\lambda )$ works as follows. Given $(N,x,{\textsf {tag}} ,\lambda )$, compute ${\textsf {tag}} ^{\prime }={\textsf {F}} \text{- }{\textsf {nvMAC}} .{\textsf {Gen}} _{K}(N,x,\lambda )$; if ${\textsf {tag}} ={\textsf {tag}} ^{\prime }$, return ${\textsf {true}} $, else return ${\textsf {false}} $. In Table 4, ${\textsf {F}} $ is used in a counter type mode of operation which was proposed in [28].

Instantiation of ${\textsf {F}} $ may be done by a fixed output length PRF such as Siphash [1]. Alternatively, it can also be done using the encryption function $E_K(\cdot )$ of a block cipher. Since E is a bijection, the PRF assumption on $E_K(\cdot )$ does not hold beyond the birthday bound. While using $E_K(\cdot )$, it would have been better to perform the analysis under the assumption that $E_K(\cdot )$ is a pseudo-random permutation (PRP). This, however, is problematic. The key $\tau $ to the hash function is derived by applying $E_K(\cdot )$. Under the assumption that $E_K(\cdot )$ is a PRP, it would not be possible to assume that $\tau $ is uniformly distributed. The differential probability determining the AXU property of the hash function is computed based on uniform random $\tau $. So, if $\tau $ cannot be considered to be uniform random, the AXU property of the hash function cannot be invoked. As a result, the proof would not go through. On the other hand, up to the birthday bound, it is reasonable to assume that the encryption function of a secure block cipher behaves like a PRF.

Table 4 A secure and efficient nvMAC scheme using a short output length PRF

Full size table

The security of ${\textsf {F}} \text{- }{\textsf {nvMAC}} $ is given by the following result.

Theorem 3

In ${\textsf {F}} \text{- }{\textsf {nvMAC}} $ defined in Table 4, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ is $\varepsilon $-AXU, where $\varepsilon (\ell ,\ell ^{\prime }) \ge 1/2^n$ for all $\ell ,\ell ^{\prime }\le L$.

Fix a query profile ${\mathfrak {C}}$. For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $ which are in ${\mathfrak {C}}$. Let $q_g=\sum _{\lambda \in {\mathcal {L}}}q_{g,\lambda }$ and $q_v=\sum _{\lambda \in {\mathcal {L}}}q_{v,\lambda }$. Let $\sigma _g$ (resp. $\sigma _v$) be the total number of bits in all the tag generation (resp. verification) queries in ${\mathfrak {C}}$.

Let $\lambda $ be such that $q_{v,\lambda }\ge 1$ and for $1\le i\le q_{v,\lambda }$, let $Q_{v,\lambda }^{(i)}=(N_{v,\lambda }^{(i)},x_{v,\lambda }^{(i)},{\textsf {tag}} _{v,\lambda }^{(i)},\lambda )$ be the i-th verification query with tag length $\lambda $. Let $\ell _{v,\lambda }^{(i)}={\textsf {len}} (x_{v,\lambda }^{(i)})$. Corresponding to $Q_{v,\lambda }^{(i)}$, there is at most one tag generation query $Q_{g,\lambda }^{(i^{\star })}=(N_{g,\lambda }^{(i^{\star })},x_{g,\lambda }^{(i^{\star })},\lambda )$ such that $N_{v,\lambda }^{(i)}=N_{g,\lambda }^{(i^{\star })}$. Let $\ell _{g,\lambda }^{(i^{\star })}={\textsf {len}} (x_{g,\lambda }^{(i^{\star })})$ if there is such a $Q_{g,\lambda }^{(i^{\star })}$, otherwise $\ell _{g,\lambda }^{(i^{\star })}$ is undefined.

Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {F}} \text{- }{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t,{\mathfrak {C}})\le & {} {\textsf {Adv}} _{{\textsf {F}} }^\mathrm{\small prf}(t+t^{\prime },B_g+B_v,n(B_g+B_v)) \nonumber \\&\quad +\,\frac{(B_g+B_v)^2}{2^n} + \frac{1}{2^{\lambda _0}} \times \sum _{1\le i \le q_{v,\lambda _0}}\gamma _i \end{aligned}$$

(32)

where

$\gamma _i=2^n\varepsilon (\ell _{v,\lambda _0}^{(i)},\ell _{g,\lambda _0}^{(i^{\star })})$ if there is a $Q_{g,\lambda _0}^{(i^{\star })}$ corresponding to $Q_{v,\lambda _0}^{(i)}$ with $N_{v,\lambda _0}^{(i)}=N_{g,\lambda _0}^{(i^{\star })}$; otherwise $\gamma _i=1$;
$b^{(i)}_{v,\lambda }={\mathfrak {b}}(x_{v,\lambda }^{(i)})$, $B_v=\sum \limits _{\lambda }\sum \limits _{1\le i\le q_{v,\lambda }} (b_{v,\lambda }^{(i)}+2)$;
$b^{(i)}_{g,\lambda }={\mathfrak {b}}(x_{g,\lambda }^{(i)})$, $B_g=\sum \limits _{\lambda }\sum \limits _{1\le i\le q_{g,\lambda }} (b_{g,\lambda }^{(i)}+2)$.

Here $t^{\prime }$ is the time required to hash $q_v+q_g$ messages of total length at most $\sigma _g+\sigma _v$, plus some bookkeeping time.

Proof

The proof is very similar to the proofs of Theorems 1 and 2. We briefly discuss the differences. There are two differences in the bound.

The first difference is in the number of queries to the PRF ${\textsf {F}} $ in the expression ${\textsf {Adv}} _{{\textsf {F}} }^\mathrm{\small prf}$. In the present case, if a query requires $b+1$ n-bit blocks to obtain the pair $(Q,\tau )$, the number of times ${\textsf {F}} $ is invoked is $b+2$. The rest of the analysis proceeds by replacing ${\textsf {F}} $ with a uniform random function $\rho $ from $\{0,1\}^n$ to $\{0,1\}^n$.

The main argument requires that for distinct values of $(N,\lambda )$, the random variables $(Q,\tau )$ are independent and uniformly distributed. The pair $(Q,\tau )$ is derived by successively applying $\rho $ to ${\textsf {S}} \oplus \textsf {bin} _n(1),\ldots ,{\textsf {S}} \oplus \textsf {bin} _n(b+1)$ where ${\textsf {S}} $ itself is obtained by applying $\rho $ to $\textsf {bin} _8(\lambda -1)||N$. If for distinct values of $(N,\lambda )$, the quantities ${\textsf {S}} ,{\textsf {S}} \oplus \textsf {bin} _n(1),\ldots ,{\textsf {S}} \oplus \textsf {bin} _n(b+1)$ are distinct, then the independent and uniform random distribution of $(Q,\tau )$ is ensured.

Let the q distinct values of (nonce, tag-length) pairs arising from the queries in ${\mathcal {S}}_{\lambda _0}$ be $(N^{(1)},\lambda ^{(1)}),$ $\ldots ,(N^{(q)},\lambda ^{(q)})$. Let ${\mathcal {D}}^{(i)}=\{{\textsf {S}} ^{(i)},{\textsf {S}} ^{(i)}\oplus \textsf {bin} _n(1),\ldots ,{\textsf {S}} ^{(i)}\oplus \textsf {bin} _n(b^{(i)}+1)\}$ be the set of random variables in the input of $\rho $ corresponding to $(N^{(i)},\lambda ^{(i)})$. Let ${\mathcal {D}}=\cup _{i=1}^q{\mathcal {D}}^{(i)}$ and so $\#{\mathcal {D}}\le B_g+B_v$. Let ${\textsf {bad}} $ be the event that any two of the variables in ${\mathcal {D}}$ are equal. Using the fact that $\rho $ is a uniform random function, it is standard to see that $\Pr [{\textsf {bad}} ]\le (B_g+B_v)^2/2^n$.

Let ${\mathcal {A}}$ be an adversary attacking the scheme where ${\textsf {F}} $ is replaced with $\rho $. We assume that ${\mathcal {A}}$ is deterministic and computationally unbounded. Let ${\textsf {succ}} ({\mathcal {A}})$ be the event that an adversary ${\mathcal {A}}$ is successful. Then

$$\begin{aligned} \Pr [{\textsf {succ}} ({\mathcal {A}})]\le & {} \Pr [{\textsf {bad}} ]+\Pr [{\textsf {succ}} ({\mathcal {A}})|\overline{{\textsf {bad}} }] \\\le & {} \frac{(B_g+B_v)^2}{2^n} + \Pr [{\textsf {succ}} ({\mathcal {A}})|\overline{{\textsf {bad}} }]. \end{aligned}$$

Conditioned on the event $\overline{{\textsf {bad}} }$, the pairs $(Q^{(i)},\tau ^{(i)})$ are independent and uniformly distributed. From this point onwards, the rest of the proof is exactly the same as the proof of Theorem 2 and provides the same bound. We skip these details. $\square $

The following corollary provides the translation of Theorem 3 in terms of query complexity.

Corollary 3

In ${\textsf {F}} \text{- }{\textsf {nvMAC}} $ defined in Table 4, suppose that the hash function $\{{\textsf {Hash}} _{\tau }\}_{\tau \in \Theta }$ be such that for any distinct $x, x^{\prime } \in {\mathcal {M}}$ and any $y \in \{0,1\}^n$, $\Pr [{\textsf {Hash}} _{\tau }(x) \oplus {\textsf {Hash}} _{\tau }(x^{\prime }) = y]$ is at most $\varepsilon \ge 1/2^n$, i.e. $\varepsilon $ is the maximum of the differential probabilities for all combination of messages.

For $\lambda \in {\mathcal {L}}$, let $q_{g,\lambda }$ (resp. $q_{v,\lambda }$) be the number of tag generation (resp. verification) queries for $\lambda $. Let $q_g=\sum _{\lambda \in {\mathcal {L}}}q_{g,\lambda }$ and $q_v=\sum _{\lambda \in {\mathcal {L}}}q_{v,\lambda }$. Let $\sigma _g$ (resp. $\sigma _v$) be the total number of bits in all the tag generation (resp. verification) queries.

Let $\lambda $ be such that $q_{g,\lambda }, q_{v,\lambda }\ge 1$ and for $1\le i\le q_{g,\lambda }$, let $Q_{g,\lambda }^{(i)}=(N_{g,\lambda }^{(i)},x_{g,\lambda }^{(i)},\lambda )$ be the i-th tag generation query with tag length $\lambda $; for $1\le i\le q_{v,\lambda }$, let $Q_{v,\lambda }^{(i)}=(N_{v,\lambda }^{(i)},x_{v,\lambda }^{(i)},{\textsf {tag}} _{v,\lambda }^{(i)},\lambda )$ be the i-th verification query with tag length $\lambda $.

Fix $\lambda _0\in {\mathcal {L}}$. Let ${\mathcal {S}}_{\lambda _0}$ be the set of all queries made by the adversary other than the verification queries for tag length $\lambda _0$. Suppose that the queries in ${\mathcal {S}}_{\lambda _0}$ give rise to at most q distinct (nonce, tag-length) values. Then

$$\begin{aligned} {\textsf {Adv}} _{{\textsf {F}} \text{- }{\textsf {nvMAC}} }^\mathrm{\small auth}[\lambda _0](t,\sigma _g,\sigma _v)\le & {} {\textsf {Adv}} _{{\textsf {F}} }^\mathrm{\small prf}(t+t^{\prime },B_g+B_v,n(B_g+B_v)) \nonumber \\&+ \frac{(B_g+B_v)^2}{2^n} + 2^{n-\lambda _0} \times q_{v,\lambda _0}\varepsilon , \end{aligned}$$

(33)

where $b^{(i)}_{v,\lambda }={\mathfrak {b}}(x_{v,\lambda }^{(i)})$, $b^{(i)}_{g,\lambda }={\mathfrak {b}}(x_{g,\lambda }^{(i)})$, $B_v=\sum \limits _{\lambda }\sum \limits _{1\le i\le q_{v,\lambda }} (b_{v,\lambda }^{(i)}+2)$ and $B_g=\sum \limits _{\lambda }\sum \limits _{1\le i\le q_{g,\lambda }} (b_{g,\lambda }^{(i)}+2)$. Here $t^{\prime }$ is the time required to hash $q_v+q_g$ messages of total length at most $\sigma _g+\sigma _v$, plus some bookkeeping time.

5 Conclusion

In this paper, we have considered the problem of constructing variable tag length MAC schemes. Several variants obtained from the Wegman-Carter MAC scheme have been shown to be insecure. One of these variants is proved to be secure. This scheme is extended to obtain constructions of single-key nonce-based variable tag length MAC schemes using either a stream cipher or a short-output PRF.

Notes

Bernstein’s proof in [5] for nonce-based MAC considers simulation of the first forgery attempt with the simulator returning ${\textsf {true}} $ if the provided tag is equal to the tag returned by a previous tag generation query on the same nonce and message, and ${\textsf {false}} $ otherwise. In our case, since we are disallowing useless queries, there could not have been a previous tag generation query for the tag length $\lambda _0$ with the same nonce and message as that of the first verification query for tag length $\lambda _0$. So, in our case, such a simulator would always return.${\textsf {false}} $.

References

Aumasson J.-P., Bernstein D.J.: Siphash: A fast short-input PRF. In: Galbraith S.D., Nandi M. (eds.) Progress in Cryptology - INDOCRYPT 2012, 13th International Conference on Cryptology in India, Kolkata, India, December 9–12, 2012. Proceedings, volume 7668 of Lecture Notes in Computer Science, pages 489–508. Springer, (2012).
Berbain C., Gilbert H.: On the security of IV dependent stream ciphers. In: Biryukov, Alex (ed.) FSE, volume 4593 of Lecture Notes in Computer Science, pp. 254–273. Springer, (2007).
Bernstein D.J.: The Salsa20 family of stream ciphers. http://cr.yp.to/papers.html#salsafamily. Document ID: 31364286077dcdff8e4509f9ff3139ad. Date: 2007.12.25.
Bernstein D.J.: The poly1305-aes message-authentication code. In: Gilbert H., Handschuh H. (eds.) Fast Software Encryption: 12th International Workshop, FSE 2005, Paris, France, February 21-23, 2005, Revised Selected Papers, volume 3557 of Lecture Notes in Computer Science, pp. 32–49. Springer, (2005).
Bernstein D.J.: Stronger security bounds for Wegman-Carter-Shoup authenticators. In: Cramer R. (ed.) EUROCRYPT, volume 3494 of Lecture Notes in Computer Science, pp. 164–180. Springer, (2005).
Bernstein D.J.: Polynomial evaluation and message authentication (2007). http://cr.yp.to/papers.html#pema.
Bernstein D.J., Chou T.: Faster binary-field multiplication and faster binary-field macs. In: Joux, Antoine, Youssef, Amr M. (eds), Selected Areas in Cryptography - SAC 2014 - 21st International Conference, Montreal, QC, Canada, August 14-15, 2014, Revised Selected Papers, volume 8781 of Lecture Notes in Computer Science, pp. 92–111. Springer, (2014).
Black J., Halevi S., Krawczyk H., Krovetz T., Rogaway P.: UMAC: Fast and secure message authentication. In: Wiener M.J. (ed.) CRYPTO, volume 1666 of Lecture Notes in Computer Science, pages 216–233. Springer, (1999).
CAESAR. Competition for Authenticated Encryption: Security, Applicability, and Robustness. http://competitions.cr.yp.to/caesar.html.
Chakraborty D., Ghosh S., Sarkar P.: A fast single-key two-level universal hash function. IACR Trans. Symmetric Cryptol. 2017(1), 106–128 (2017).
Article Google Scholar
Dobraunig C., Eichlseder M., Mendel F., Schläffer M.: Remark on variable tag lengths and OMD. https://groups.google.com/g/crypto-competitions/c/sekKDsIJvwU/m/5_V_TzZQaWYJ?pli=1. Accessed 15 Nov 2019 (2014).
Finney H.: CFRG discussion on UMAC. https://marc.info/?l=cfrg&m=143336318427069&w=2. Accessed 15 Nov 2019 (2005).
Finney H.: CFRG discussion on UMAC. https://marc.info/?l=cfrg&m=143336318527072&w=2. Accessed 15 Nov 2019 (2005).
Gilbert E.N., MacWilliams F., Jessie S., Neil J.A.: Codes which detect deception. Bell Syst. Tech. J. 53, 405–424 (1974).
Article MathSciNet Google Scholar
Krovetz T.: UMAC: Message authentication code using universal hashing. https://tools.ietf.org/html/draft-krovetz-umac-05.html. Accessed 15 Nov 2019, (2005).
Krovetz T., Rogaway P.: The software performance of authenticated-encryption modes. In: Joux, Antoine (ed.) FSE, volume 6733 of Lecture Notes in Computer Science, pages 306–327. Springer, (2011).
Luykx A., Preneel B.: Optimal forgeries against polynomial-based macs and GCM. In: Nielsen J.B., Rijmen V. (eds.) Advances in Cryptology - EUROCRYPT 2018 - 37th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tel Aviv, Israel, April 29 - May 3, 2018 Proceedings, Part I, volume 10820 of Lecture Notes in Computer Science, pages 445–467. Springer, (2018).
Manger J.H.: Attacker changing tag length in OCB. https://mailarchive.ietf.org/arch/msg/cfrg/gJtV9FCw92MguqqhxrSNUyIDZIw. Accessed 15 Nov 2019, (2013).
McGrew D.A., Viega J.: The security and performance of the Galois/Counter Mode (GCM) of operation. In: Canteaut, A., Viswanathan, K. (eds.) INDOCRYPT, volume 3348 of Lecture Notes in Computer Science, pages 343–355. Springer (2004).
Nandi M.: Bernstein bound on WCS is tight—repairing luykx-preneel optimal forgeries. In: Shacham H., Boldyreva A. (eds.) Advances in Cryptology - CRYPTO 2018 - 38th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19–23, 2018, Proceedings, Part II, volume 10992 of Lecture Notes in Computer Science, pp. 213–238. Springer (2018).
Ounsworth M.: Footguns as an axis of security analysis. https://groups.google.com/a/list.nist.gov/forum/#!topic/pqc-forum/l2iYk-8sGnI. Accessed 15 Nov 2019, (2019).
Reyhanitabar R., Vaudenay S., Vizár D.: Authenticated encryption with variable stretch. In: Cheon J.H., Takagi T. (eds.) Advances in Cryptology—ASIACRYPT 2016 - 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, December 4–8, 2016, Proceedings, Part I, volume 10031 of Lecture Notes in Computer Science, pp. 396–425 (2016).
Rogaway P., Wagner D.: A critique of ccm. Cryptology ePrint Archive, Report 2003/070, (2003). https://eprint.iacr.org/2003/070.
Safavi-Naini R., Lisý V., Desmedt Y.: Economically optimal variable tag length message authentication. In: Kiayias A. (ed.) Financial Cryptography and Data Security - 21st International Conference, FC 2017, Sliema, Malta, April 3–7, 2017, Revised Selected Papers, volume 10322 of Lecture Notes in Computer Science, pp. 204–223. Springer (2017).
Shoup V.: On fast and provably secure message authentication based on universal hashing. In: Koblitz, Neal (ed.) CRYPTO, volume 1109 of Lecture Notes in Computer Science, pages 313–328. Springer (1996).
UMAC. CFRG discussion on UMAC. http://marc.info/?l=cfrg&m=143336318427068&w=2. Accessed 15 Nov 2019, (2005).
Wagner D.: CFRG discussion on UMAC. https://marc.info/?l=cfrg&m=143336318527073&w=2. Accessed 15 Nov 2019, (2005).
Wang P., Feng D., Wenling W.: HCTR: A variable-input-length enciphering mode. In: Feng D., Lin D., Yung M. (eds.) CISC, volume 3822 of Lecture Notes in Computer Science, pp. 175–188. Springer, (2005).
Wegman Mark N.: Carter, Larry: New hash functions and their use in authentication and set equality. J. Comput. Syst. Sci. 22(3), 265–279 (1981).
Article Google Scholar
Winograd S.: A new algorithm for inner product. IEEE Trans. Comput. 17, 693–694 (1968).
Article Google Scholar

Download references

Acknowledgements

We are grateful to the reviewers for their detailed reading of the paper and for providing helpful comments.

Author information

Authors and Affiliations

Indian Statistical Institute, 203, B.T.Road, Kolkata, 700108, India
Sebati Ghosh & Palash Sarkar

Authors

Sebati Ghosh
View author publications
You can also search for this author in PubMed Google Scholar
Palash Sarkar
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Palash Sarkar.

Additional information

Communicated by F. Mendel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Attack on ${\textsf {nvMAC\text{- }t}} 6$

The attack is described in Algorithm 2.

Proposition 4

The attack given in Algorithm 2 on the scheme ${\textsf {nvMAC\text{- }t}} 6$ produces a forgery for tag length $\lambda $ which is correct with probability 1. It requires at most $2^{\lambda _1+1}+2^{n-\lambda _1}$ verification queries on tag length $\lambda _1$ and one tag generation query and at most $2^{n - \lambda _1}$ verification queries on tag length $\lambda $.

Proof

That the attack mentioned in Algorithm 2 forges with probability 1 is proved if it can be shown that there is an iteration of the do-while loop in Steps 17 to 24 such that ${\mathcal {R}}_v^{(5)} = {\textsf {true}} $, i.e. there is a verification query in Step 23 which succeeds.

From Steps 4 and 5, we get that

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {F}} _{K}(N_1)\oplus {\textsf {Hash}} _{\tau _{\lambda _1}}(x_1))= & {} {\textsf {tag}} ^{(1)}. \end{aligned}$$

(34)

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {F}} _{K}(N_1)\oplus {\textsf {Hash}} _{\tau _{\lambda _1}}(x_2))= & {} {\textsf {tag}} ^{(2)}. \end{aligned}$$

(35)

So,

$$\begin{aligned} {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _{\lambda _1}}(x_1)\oplus {\textsf {Hash}} _{\tau _{\lambda _1}}(x_2)) = {\textsf {tag}} ^{(1)} \oplus {\textsf {tag}} ^{(2)}. \end{aligned}$$

(36)

Here ${\textsf {tag}} ^{(1)} \oplus {\textsf {tag}} ^{(2)}$ is a $\lambda _1$-bit binary string.

Following Proposition 2, for each choice of $c_1$ in the do-while loop in Steps 17 to 24, the equation in Step 10 can be solved to get $\tau _{c_1}$ and $x_{c_1}$. The fact that ${\textsf {Hash}} _{\tau _{\lambda _1}}(x_1)\oplus {\textsf {Hash}} _{\tau _{\lambda _1}}(x_2) \in \{0,1\}^n$ and (36) suggest that there is a correct $c_1$, such that the equation in Step 10 holds and we consider that iteration of the do-while loop which deals with this particular $c_1$. The $\tau _{c_1}$ obtained in this iteration is the actual hash key used in the scheme. So,

$$\begin{aligned}&{\textsf {nvMAC\text{- }t}} 6(N_1,x_3,\lambda _1) \nonumber \\&\quad = {\textsf {msb}} _{\lambda _1}({\textsf {F}} _{K}(N_1)\oplus {\textsf {Hash}} _{\tau _{c_1}}(x_3)) \nonumber \\&\quad = {\textsf {tag}} ^{(1)} \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _{c_1}}(x_1)) \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _{c_1}}(x_3)) \end{aligned}$$

(37)

$$\begin{aligned}&\quad = x_{c_1} \oplus {\textsf {msb}} _{\lambda _1}({\textsf {Hash}} _{\tau _{c_1}}(x_3)) . \end{aligned}$$

(38)

The expression in (37) comes from (34) and that in (38) comes from Step 12 in Algorithm 2. Hence, in this particular iteration of the do-while loop, ${\mathcal {R}}_v^{(3)} = {\textsf {true}} $ and the loop terminates.

Noting that $\lambda =n$, from Step 15, we get

$$\begin{aligned} {\textsf {F}} _K(N_1)\oplus {\textsf {Hash}} _{\tau _{\lambda }}(x_4) = {\textsf {tag}} ^{(4)}\Rightarrow & {} {\textsf {Hash}} _{\tau _{\lambda }}(x_4) = {\textsf {tag}} ^{(4)} \oplus {\textsf {F}} _K(N_1). \end{aligned}$$

(39)

Here, the n bits of ${\textsf {tag}} ^{(4)}$ and ${\textsf {msb}} _{\lambda _1}(\cdot )$ of ${\textsf {F}} _K(N_1)$, which is $x_{c_1}$, are known. As ${\textsf {Hash}} _{\tau _{\lambda }}(x_4)\in \{0,1\}^n$, there is a $c_2\in \{0,1\}^{n-\lambda _1}$, such that,

$$\begin{aligned} {\textsf {Hash}} _{\tau _{\lambda }}(x_4) = {\textsf {msb}} _{\lambda _1}({\textsf {tag}} ^{(4)} \oplus {\textsf {F}} _K(N_1))||c_2 = ({\textsf {msb}} _{\lambda _1}({\textsf {tag}} ^{(4)}) \oplus x_{c_1}) || c_2. \end{aligned}$$

(40)

For the correct choice of $c_2$, the correct values of $\tau _{c_2}$ and $x_{c_2}$ are obtained in Steps 21 and 22 respectively. For the correct $c_2$, from (39) and (40), we get,

$$\begin{aligned} {\textsf {F}} _K(N_1) = {\textsf {Hash}} _{\tau _{\lambda }}(x_4) \oplus {\textsf {tag}} ^{(4)} = (({\textsf {msb}} _{\lambda _1}({\textsf {tag}} ^{(4)}) \oplus x_{c_1}) || c_2) \oplus {\textsf {tag}} ^{(4)}, \end{aligned}$$

(41)

which equals $x_{c_2}$ according to Step 22 in Algorithm 2. Hence,

$$\begin{aligned} {\textsf {nvMAC\text{- }t}} 6(N_1,x,\lambda )= & {} {\textsf {F}} _K(N_1)\oplus {\textsf {Hash}} _{\tau _{c_2}}(x) = x_{c_2} \oplus {\textsf {Hash}} _{\tau _{c_2}}(x). \end{aligned}$$

(42)

The last equality follows from (41). From (42), it is clear that for the iteration of the do-while loop in Steps 17 to 24, in which the correct $c_2$ is used, ${\mathcal {R}}_v^{(5)} = {\textsf {true}} $ with probability 1, which proves the first part of the Lemma.

Steps 4 and 5 each require at most $2^{\lambda _1}$ verification queries for tag length $\lambda _1$. Step 13 requires at most $2^{n-\lambda _1}$ verification queries for tag length $\lambda _1$. A tag generation query for tag length $\lambda $ is made in Step 15 and at most $2^{n-\lambda _1}$ verification queries are made for tag length $\lambda $ in Step 23. This shows the complexity of the attack. $\square $

Remarks

1.
With $\lambda =n$ suppose $\lambda _1=n/2$. Then the adversary makes a maximum of $3\cdot 2^{n/2}$ verification queries for tag length n/2, one tag generation query and at most $2^{n/2}$ verification queries for tag length n. It produces a forgery for tag length n which is correct with probability 1. So, this is a valid forgery attack for tag length n.
2.
Algorithm 2 makes a single tag generation query. Hence, the issue of repeating nonces in tag generation queries does not arise.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ghosh, S., Sarkar, P. Variants of Wegman-Carter message authentication code supporting variable tag lengths. Des. Codes Cryptogr. 89, 709–736 (2021). https://doi.org/10.1007/s10623-020-00840-w

Download citation

Received: 27 May 2020
Revised: 24 December 2020
Accepted: 26 December 2020
Published: 04 February 2021
Issue Date: April 2021
DOI: https://doi.org/10.1007/s10623-020-00840-w

Keywords

Mathematics Subject Classification

94A60

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Variants of Wegman-Carter message authentication code supporting variable tag lengths

Abstract

Similar content being viewed by others

Secure Message Authentication Against Related-Key Attack

Non-adaptive Group-Testing Aggregate MAC Scheme

Full PRF-Secure Message Authentication Code Based on Tweakable Block Cipher

1 Introduction

1.1 Our contributions

1.2 Previous and related works

2 Definitions

2.1 Hash function

2.2 Pseudo-random function

2.3 Variable tag length nonce-based message authentication code

Remark

Proposition 1

3 Towards building a variable tag length MAC

Proposition 2

Proposition 3

Proof

Remarks

Remark

4 Secure and efficient MAC schemes with variable length tag

Theorem 1

Remark

Proof

Corollary 1

4.1 Reducing key size

Theorem 2

Proof

Corollary 2

Theorem 3

Proof

Corollary 3

5 Conclusion

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Attack on \({\textsf {nvMAC\text{- }t}} 6\)

Attack on \({\textsf {nvMAC\text{- }t}} 6\)

Proposition 4

Proof

Remarks

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation