Abstract
Very recently (in CRYPTO 2017), Dai, Hoang, and Tessaro introduced the Chi-square method (χ2 method), which can be applied to obtain an upper bound on the statistical distance between two joint probability distributions. The authors applied this method to prove the pseudorandom function security (PRF-security) of the sum of two random permutations. In this work, we revisit their proof and find a non-trivial gap in it. We plug this gap for two specific cases and state the general case as an assumption whose proof is essential for the completeness of the proof by Dai et al. A complete, correct, and transparent proof of the full security of the sum of two random permutations construction is highly desirable, especially due to its importance and two-decades-old legacy. The proposed χ2 method seems to have potential for application to similar problems, where a similar gap may creep into a proof. These considerations motivate us to communicate our observation in a formal way. On the positive side, we provide a very simple proof of the PRF-security of the truncated random permutation construction (a method to construct a PRF from a random permutation) using the χ2 method. We note that a proof of the PRF-security due to Stam is already known for this construction in a purely statistical context. However, the use of the χ2 method makes the proof much simpler.
1 Introduction
Different tools from probability and statistics are now heavily used in different areas of cryptography. In this paper, we focus on a statistical tool, termed the χ2 method, which was introduced by Dai, Hoang, and Tessaro in CRYPTO 2017 [7]. Although a method essentially similar to the χ2 method has been known in statistics (since 1978), we believe that the χ2 method is new in the context of cryptography. In [7], this method has been used to show pseudorandom function security (PRF-security) of two well-known constructions, namely the sum of random permutations [1, 17, 21, 22] and encrypted Davies-Meyer (EDM) [5, 19]. Further, we feel that this method may help us to obtain tight (and simplified) proofs for certain constructions where proofs so far have evaded more classical methods, such as the H-coefficient method [20].
χ2 Method. The distinguishing advantage of a family of keyed functions is bounded by the total variation (also known as statistical distance) between the output distribution of the family and the output distribution of a random function. The total variation between two probability distributions P0 and P1 over a sample space Ω, denoted dTV(P0,P1), is defined as half of the L1-norm \(\| \mathbf {P}_{\mathbf {0}}- \mathbf {P}_{\mathbf {1}}\|_{1} := \sum _{x \in \mathrm {\Omega }} |\mathbf {P}_{\mathbf {0}}(x) - \mathbf {P}_{\mathbf {1}}(x)|\). In [7], the authors revisited a variation of the additivity property of the KL divergence between two joint distributions, which they termed the χ2 method. When P0 and P1 are joint distributions, this method provides an upper bound on ∥P0 − P1∥1 based on the χ2 distances between the conditional distributions of P0 and P1. Next, we recall the definition of the χ2 distance. In what follows, we use the convention that 0/0 = 0.
Definition 1
The χ2 distance between distributions P0 and P1 (over a sample space Ω) with P0 ≪ P1 (i.e., the support of P0 is contained in the support of P1) is defined as

$$d_{\chi^{2}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) := \sum_{x \in \mathrm{\Omega}} \frac{\left(\mathbf{P}_{\mathbf{0}}(x) - \mathbf{P}_{\mathbf{1}}(x)\right)^{2}}{\mathbf{P}_{\mathbf{1}}(x)}.$$
The χ2 distance has its origin in mathematical statistics, dating back to Pearson (see [18] for some history). It can be seen that the χ2 distance is not symmetric and hence is not a metric. However, it is useful for bounding other metrics, e.g., the total variation. In the following, we briefly describe the χ2 method (see Appendix A for details and a proof).
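To make these definitions concrete, the following is a small illustrative sketch (ours, not from [7]) that computes both distances for finite distributions given as Python dictionaries. It also checks the Cauchy-Schwarz consequence \(d_{\text{TV}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) \leq \frac{1}{2}\sqrt{d_{\chi^2}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}})}\), which is one elementary way in which the χ2 distance bounds the total variation.

```python
# Illustrative sketch (not from the paper): total variation and chi-squared
# distance for finite distributions represented as {outcome: probability} dicts.
import math

def d_tv(p0, p1):
    """Total variation: half the L1-norm of the difference."""
    support = set(p0) | set(p1)
    return 0.5 * sum(abs(p0.get(x, 0.0) - p1.get(x, 0.0)) for x in support)

def d_chi2(p0, p1):
    """Chi-squared distance; requires supp(p0) contained in supp(p1)."""
    total = 0.0
    for x in set(p0) | set(p1):
        a, b = p0.get(x, 0.0), p1.get(x, 0.0)
        if b == 0.0:
            if a != 0.0:
                raise ValueError("P0 is not absolutely continuous w.r.t. P1")
            continue  # convention 0/0 = 0
        total += (a - b) ** 2 / b
    return total

p0 = {0: 0.5, 1: 0.3, 2: 0.2}
p1 = {0: 1/3, 1: 1/3, 2: 1/3}

# the chi-squared distance is not symmetric ...
assert abs(d_chi2(p0, p1) - d_chi2(p1, p0)) > 1e-9
# ... but it dominates total variation: d_TV <= (1/2) * sqrt(d_chi2)
assert d_tv(p0, p1) <= 0.5 * math.sqrt(d_chi2(p0, p1))
```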
Let X = (X1,…,X q ) and Y = (Y1,…,Y q ) be two multivariate random variables taking values in Ωq. In order to simplify the notation, we denote by Xi− 1 the joint random variable (X1,…,Xi− 1). Let \(\mathbf {P}_{{0}_{x_{1}, \ldots , x_{i-1}}} \) denote the conditional probability distribution of X i given X1 = x1, …, Xi− 1 = xi− 1. We similarly write \(\mathbf {P}_{{1}_{x_{1},\ldots , x_{i-1}}}\) for the distribution of Y i given \(\mathsf {Y}_{1} = x_{1}\), …, Yi− 1 = xi− 1. Then the χ2 method says

$$d_{\mathrm{TV}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) \leq \left(\frac{1}{2} \sum_{i=1}^{q} \mathrm{Ex}\left[\chi^{2}(\mathsf{X}_{1}, \ldots, \mathsf{X}_{i-1})\right]\right)^{1/2},$$
where \(\chi ^{2}(x_{1}, \ldots , x_{i-1}) = d_{\chi ^{2}}(\mathbf {P}_{{0}_{x_{1},\ldots , x_{i-1}}}, \mathbf {P}_{{1}_{x_{1}, \ldots , x_{i-1}}})\) and for all x1,…,xi− 1, \(\mathbf {P}_{{0}_{x_{1},\ldots , x_{i-1}}}\ll \mathbf {P}_{{1}_{x_{1}, \ldots , x_{i-1}}}\). Note that we need this condition to define \(d_{\chi ^{2}}\).
XOR of Two Random Permutations. The XOR or sum of two random permutations is a well-known construction, proposed and studied by Hall et al. in [13], for the conversion of pseudorandom permutations (PRPs) into pseudorandom functions (PRFs).Footnote 1 Given a permutation π : {0,1}n↦{0,1}n, the construction creates a function f : {0,1}n− 1↦{0,1}n, defined as f(x) = π(0||x) ⊕ π(1||x). When π is chosen uniformly at random from Perm n , the set of all permutations of {0,1}n, how well does f resemble (in a certain well-defined sense) a random function with the same domain and range (a function chosen uniformly from the set of all functions from the domain to the range)? A satisfactory answer to this question remained elusive for over two decades. There have been attempts [1, 17, 21, 22] to prove information-theoretic security of the construction. However, the proofs either fell short of proving full security (to be made precise in the next section) of the construction [17], or were sketchy [1], or contained non-trivial gaps and were difficult to follow [21, 22], as has also been observed by the authors of [7].Footnote 2 Also, as a related problem, Cogliati, Lampe, and Patarin [4] gave weaker bounds for the case of the sum of at least three permutations. The XOR construction is important since it has been used to obtain constructions achieving beyond-birthday (or sometimes almost full) security (e.g., CENC [15], PMAC_Plus [3] and ZMAC [14]).
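As a concrete illustration (our own sketch, with toy parameters, representing the permutation as a shuffled list), the construction can be sampled as follows; injectivity of π immediately implies that f never outputs 0n.

```python
# Illustrative sketch (small parameters): the XOR construction
# f(x) = pi(0||x) xor pi(1||x) from a single random permutation pi.
import random

n = 6                      # block size; the domain of f is {0,1}^(n-1)
N = 1 << n
pi = list(range(N))
random.shuffle(pi)         # a uniformly random permutation of {0,1}^n

def f(x):
    """x is an (n-1)-bit input, encoded as an integer in [0, 2^(n-1))."""
    assert 0 <= x < N // 2
    return pi[x] ^ pi[x | (N // 2)]   # 0||x has top bit 0, 1||x has top bit 1

outputs = [f(x) for x in range(N // 2)]
# pi is injective, so pi(0||x) != pi(1||x) and f never outputs 0^n:
assert all(y != 0 for y in outputs)
```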
1.1 Main results in the paper
In [7], Dai et al. used the χ2 method to prove full security of the XOR construction (XOR of two random permutations). In this paper, we take a closer look at the proof and find a non-trivial gap in it. The gap is due to incorrect equalities involving conditional expectations. We have also made an attempt to fix the proof, and we show that the proof can be fixed under Assumption 1 (stated in Section 4). We prove the assumption for two special cases and leave the general case as an open problem.
In this note, we communicate the above observation formally. This serves two purposes: (a) to motivate a flawless proof of this problem, especially owing to its importance and two-decades-old legacy, and (b) to prevent these types of loopholes from creeping into proofs involving the χ2 method, especially since the method seems to have potential for application to similar problems.
Truncation of a Random Permutation. Although the application (in [7]) of the χ2 method to the XOR construction contains a gap, the technique can be powerful for bounding the PRF-security of other constructions. In fact, in [7], the authors applied this method to bound the PRF-security of the EDM (or encrypted Davies-Meyer) construction. In this note, we apply this technique to the truncated random permutation construction and obtain a very simple proof of the known tight bound on its PRF-security. This construction was studied by Stam (in a statistical context) in 1978 [25] and later by many others (e.g., [1, 8,9,10, 13]). Stam's proof technique is very close to the χ2 method; however, the other proofs are very different and produce different results. The difference between the proof methods of the relevant results from [1, 8, 13] and [25] is discussed in [10]. Our proof approach is more modular and uses the χ2 method explicitly. We discuss these points very briefly in Remark 1 and Remark 2.
The PRF property of the truncated random permutation construction has recently been used in the key derivation for AES-GCM-type, counter-based authenticated encryption constructions [11].
1.2 Organization of the paper
The rest of the paper is organized as follows. In the next section, we provide a brief overview of the relevant security notions and the χ2 method. There we also discuss the two constructions: the XOR of two random permutations construction and the truncated random permutation construction. Section 3 is devoted to the proof of Theorem 2. In Section 4, we discuss the proof, by Dai et al., of the full security of the XOR of two random permutations construction, where we also point out the gap in it. In Section 5, we provide the proofs of Assumption 1 for two specific cases. We conclude in Section 6 by remarking on an identical gap in the proof of full security of a related construction, termed the XOR of two independent random permutations. Finally, in Appendix A, we provide a self-contained proof of the χ2 method; the essential ingredients of the proof are the same as in [7], but we also cover the finer details (such as the proof of Pinsker's inequality).
2 Preliminaries
Notation and Convention
We use the short-hand notation Xt to denote a tuple (X1,…,X t ). We also write \({\mathcal {S}}^{t}\) to denote the t-fold Cartesian product of the set \(\mathcal {S}\) with itself. It will be clear from the context whether Xt means a t-tuple (when X is a tuple) or product set (when X is a set).
We use notations X,Y,Z etc. (possibly with suffix) to represent random variables over some sets. Following the above notational convention, Xt would represent a t-tuple of random variables or random vector (X1,…,X t ). We use \(\mathcal {E}, \mathcal {S}, \mathcal {T}\) etc. (possibly with suffix) to denote sets. \(\mathcal {A}\) will always represent an adversary.
In this paper, we fix a positive integer n, and we denote 2n by N.
2.1 PRF-security definition
The pseudorandom function (PRF) is a very popular security notion in cryptography. While analyzing a message authentication code (MAC), we mostly study its PRF-security, as it is a stronger notion than MAC-security. PRFs have also been used to define encryption schemes, authenticated encryptions, and other cryptographic algorithms.
Now we formally define the PRF-advantage of an algorithm or a keyed function. By \(\mathsf {X} \leftarrow _{\$} \mathcal {S}\) we mean that X is sampled uniformly from a finite set \(\mathcal {S}\). Let m and p be positive integers. Let RP m denote the random permutation chosen uniformly from Perm m , the set of all permutations on {0,1}m, i.e., RP m ← $ Perm m . Similarly, let RFm→p← $ Funcm→p (the set of all functions from {0,1}m to {0,1}p). Let \(\mathcal {K}\) be a finite set (it is the key space of the construction). Given a function \(f : \mathcal {K} \times \{0,1\}^{m} \to \{0,1\}^{p}\) and for every \(k \in \mathcal {K}\), we denote f k to represent the function (also called keyed function) f(k,⋅) ∈Funcm→p. We now define the PRF-advantage of an oracle adversary \(\mathcal {A}\) against f as follows.
Definition 2 (PRF-advantage)
Let \(\mathcal {A}\) be a distinguisher (oracle algorithm) and \(f : {\mathcal {K}} \times \{0,1\}^{m} \to \{0,1\}^{p}\). Then, the PRF-advantage of \(\mathcal {A}\) against f is defined as
As we restrict ourselves to deterministic keyed functions (i.e., functions which give the same output on the same input), we can assume, without loss of generality, that the adversary does not repeat its queries. In other words, if Q1,…,Q q are all the queries then these are distinct. We can also assume that \(\mathcal {A}\) is deterministic as it can always run with the best random coins which maximize the advantage. Suppose \(\mathcal {A}\) makes q distinct queries adaptively, denoted Q1,…,Q q , and obtains responses U1,…,U q . So, when \(\mathcal {A}\) is interacting with RFm→p, the outputs are uniformly and independently distributed over {0,1}p, which we denote as U1,…,U q ← $ {0,1}p.
Similarly, let X1,…,X q denote the outputs of fK where \(\textsf {K} \leftarrow _{\$} {\mathcal {K}}\). We denote the probability distributions associated with U1,…,U q and X1,…,X q by P 1 and P 0 respectively. Thus,
where \({\mathcal {E}}\) is the set of all q-tuples of responses xq := (x1,…,x q ) ∈ ({0,1}p)q for which \(\mathcal {A}\) returns 1. From the definition, the total variation (also known as the statistical distance) between P 0 and P 1 is
Hence,
Footnote 3 Thus, the main cryptographic objective (that of determining the PRF-advantage \({{\mathbf {Adv}}^{\text {prf}}_{f}}(\mathcal {A})\)) turns out to be a purely probability or statistical problem. Next, we discuss the χ2 method which provides an upper bound of total variation between two joint distributions.
2.2 χ 2 method
Let \(\mathsf {X} := (\mathsf {X}_{1}, \ldots , \mathsf {X}_{q})\) and Z := (Z1,…,Z q ) be two random vectors of size q distributed over Ωq. Let us denote the probability distributions of X and Z by P 0 and P 1 respectively. We denote the conditional probability distributions as follows.
When i = 1, \(\mathbf {P}_{\mathbf {0}|x^{i-1}}(x_{1})\) represents \(\mathbf {P}(\mathsf {X}_{1} = x_{1})\). Similarly, for \(\mathbf {P}_{\mathbf {1}|x^{i-1}}(x_{1})\). Let \(x^{i-1} \in \mathrm {\Omega }^{i-1}\), i ≥ 1. Let us denote the χ2 distance between \(\mathbf {P}_{\mathbf {0}|x^{i-1}}\) and \(\mathbf {P}_{\mathbf {1}|x^{i-1}}\) as \(\chi ^{2}(x^{i-1})\), i.e.,
Thus, χ2 is a real valued function. The next theorem is the crux of the χ2 method; it bounds the total variation between two joint distributions in terms of the χ2 distance between the corresponding conditional distributions.
Theorem 1 ([7])
Suppose P 0 and P 1 denote the probability distributions of X := (X1,…,X q ) and \(\mathsf {Z} := (\mathsf {Z}_{1}, \ldots , \mathsf {Z}_{q})\), and for all \(x_{1}, \ldots , x_{i-1}\) we have \(\mathbf {P}_{\mathbf {0}|x^{i-1}}\ll \mathbf {P}_{\mathbf {1}|x^{i-1}}\). Then

$$d_{\mathrm{TV}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) \leq \left(\frac{1}{2} \sum_{i=1}^{q} \mathrm{Ex}\left[\chi^{2}(\mathsf{X}^{i-1})\right]\right)^{1/2}.$$
For the sake of completeness, we provide a full proof of this theorem in Appendix A. In our setup, note that \(\mathsf {Z}_{1}, \ldots , \mathsf {Z}_{q} \leftarrow _{\$} {\{0,1\}}^{p}\) for some p and hence \(\mathbf {P}_{\mathbf {1}|x^{i-1}}(x_{i}) = \frac {1}{2^{p}}\) for all xi. So,

$$\chi^{2}(x^{i-1}) = \sum_{x_{i}} \frac{\left(\mathbf{P}_{\mathbf{0}|x^{i-1}}(x_{i}) - 2^{-p}\right)^{2}}{2^{-p}} = 2^{p} \sum_{x_{i}} \left(\mathbf{P}_{\mathbf{0}|x^{i-1}}(x_{i}) - 2^{-p}\right)^{2}.$$
In the following subsection, we describe two constructions for which this method was applied.
2.3 Two random permutation based constructions
In this paper, we mainly deal with two constructions based on a random permutation RP n . Similar to a random function, if all queries to a random permutation RP n are distinct and depend only on the previous responses (which is the case for an adversary), the outputs V1,…,V q behave like a random sample without replacement (WOR) from {0,1}n. We write \(\mathsf {V}_{1}, \ldots , \mathsf {V}_{q} \gets _{\text {wor}} \{0,1\}^{n}\) to denote this. More formally, for all distinct \(x_{1}, \ldots , x_{q} \in \{0,1\}^{n}\), \(\textbf {P}(\mathsf {V}_{1} = x_{1}, \ldots \mathsf {V}_{q} = x_{q}) = \frac {1}{(N)_{q}}\), where (N) q = N(N − 1)⋯(N − q + 1). Now, we briefly describe the constructions.

(1) XOR Construction. Define XOR π : {0,1}n− 1 →{0,1}n to be the construction that takes a permutation π ∈ Perm n and, on input x ∈{0,1}n− 1, returns π(x∥0) ⊕ π(x∥1). Thus, the XOR construction based on a random permutation RP n returns X1,…,X q where X1 :=V1 ⊕V2, …, X q :=V2q− 1 ⊕V2q and V1,…,V2q←wor{0,1}n.

(2) trRP Construction. Let m ≤ n and trunc m denote the truncation function which returns the first m bits of x ∈{0,1}n. A truncated random permutation is the composition of a random permutation followed by the truncation function. More formally, we define for every x ∈{0,1}n,
Note that it is a function family, keyed by random permutation, mapping the set of all n-bit sequences to the set of all m-bit sequences. Let X1,…,X q denote the q outputs of trRP m . Then X i =trunc m (V i ) for all i.
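The following small sketch (ours, with toy parameters) samples both output sequences through the WOR view described above; V is the WOR sample, and an integer shift implements trunc m .

```python
# Illustrative sketch: outputs of the two constructions expressed via a
# without-replacement (WOR) sample V_1, V_2, ... from {0,1}^n.
import random

n, m, q = 8, 3, 5
N = 1 << n
V = random.sample(range(N), 2 * q)     # V_1, ..., V_2q  <-wor  {0,1}^n

# XOR construction: X_i = V_{2i-1} xor V_{2i}
xor_outputs = [V[2 * i] ^ V[2 * i + 1] for i in range(q)]
assert all(x != 0 for x in xor_outputs)          # XOR outputs avoid 0^n

# trRP construction: X_i = trunc_m(V_i), the first m bits of V_i
trrp_outputs = [v >> (n - m) for v in V[:q]]
assert all(0 <= x < (1 << m) for x in trrp_outputs)
```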
PRF-security of this construction has been studied by Stam in 1978, though in a much broader context (see [25] for details), and later by others (e.g., [1, 8,9,10, 13]). In particular, Stam proved the following statement.
Theorem 2 ([25])
Let V1,…,V q ←wor {0,1}n, U1,…,U q ← $ {0,1}m and X i = trunc m (V i ) for all i. Then
where X = (X1,…,X q ) and U = (U1,…,U q ).
The following corollary (though not proved by Stam) is immediate from the relationship between PRF-advantage and total variation.
Corollary 1
Let M = 2m, N = 2n and m ≤ n. For any adversary \(\mathcal {A}\) making q queries we have
Remark 1
The upper bounds on the PRF-advantage of the trRP construction given in [8, 13] are different from (and weaker than) the one obtained by Stam, although the bounds are similar for some choices of parameters. In [10], all these results are mentioned, and the proofs are briefly surveyed. Moreover, [10] proves a general tight lower bound on the PRF-advantage (improving on the lower bound declared in [13]).
3 Proof of theorem 2 using the χ 2 method
Now we provide an alternative proof of Theorem 2 using the χ2 method. We briefly recall the setup. Here V1,…,V q ←wor {0,1}n and X i =trunc m (V i ). Let x ∈{0,1}m, let i ≥ 1 be an integer, and let K = N/M. Also, let H denote the number of indices j < i for which trunc m (V j ) = x. The probability distribution of H is well known as the hypergeometric distribution HG(N,K,i − 1). Writing s := i − 1, for every max(0,s + K − N) ≤ a ≤ min(K,s) we have
The following fact states the expectation and variance formulas for a hypergeometric distribution. The proof can be found in standard probability theory textbooks and hence we skip it.
Fact 1
Let H follow hypergeometric distribution HG(N,K,s) and let p denote \(\frac {K}{N}\). Then,
As an aside, we mention that the factor \(\frac {N-s}{N-1}\) is also known as the finite sampling correction factor. Up to this factor, the expression of the variance is the same as that of the binomial distribution.
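Fact 1 can be checked exactly from the probability mass function of HG(N,K,s). The sketch below (ours) does so for one arbitrary choice of parameters, using the standard formulas E[H] = sp and Var[H] = sp(1 − p)(N − s)/(N − 1) with p = K/N.

```python
# Sketch verifying Fact 1 exactly: mean and variance of HG(N, K, s)
# computed from the probability mass function via binomial coefficients.
from math import comb

def hg_pmf(N, K, s):
    lo, hi = max(0, s + K - N), min(K, s)
    return {a: comb(K, a) * comb(N - K, s - a) / comb(N, s)
            for a in range(lo, hi + 1)}

N, K, s = 64, 16, 10
p = K / N
pmf = hg_pmf(N, K, s)
mean = sum(a * pr for a, pr in pmf.items())
var = sum((a - mean) ** 2 * pr for a, pr in pmf.items())

assert abs(mean - s * p) < 1e-9                               # E[H] = s*p
assert abs(var - s * p * (1 - p) * (N - s) / (N - 1)) < 1e-9  # Fact 1 variance
```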
Now, we apply the χ2 method to bound the total variation dTV(X,U), where U1,…,U q ← $ {0,1}m. Let P 0 and P 1 denote the probability distributions of X and U respectively. Note that
Let \(N_{i,x}(x^{i-1}) := |{\mathcal {S}}_{i,x}(x^{i-1})|\) and Hi,x = Ni,x(Xi− 1). Then it is easy to see from the definition of the hypergeometric distribution that Hi,x follows HG(N,N/M,(i − 1)). Now, we compute the χ2 function evaluated at xi− 1.
Hence,
This follows from the linearity of expectation and the fact that Ex[Hi,x] = (i − 1)/M. By substituting the value of Var[Hi,x] as described in Fact 1, we obtain
Now by using Theorem 1 we have
Remark 2
In order to draw a comparison between our proof (using the χ2 method) of Theorem 2 and the proof due to Stam, we remark that the main ideas of both proofs are the same; namely, both use the chain rule of the KL divergence, the concavity of the logarithm function, and also the hypergeometric distribution. However, unlike in our case (in (6)), Stam did not make explicit use of the variance of the hypergeometric distribution; instead, he used Jensen's inequality. Moreover, our proof is simpler and more modular than Stam's, with a more direct approach.
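As a sanity check of this derivation, the following brute-force sketch (ours, with tiny parameters n = 4, m = 2, q = 3) builds the exact joint distribution of the trRP outputs, computes the exact total variation from uniform, and verifies that it is dominated by the χ2 bound \(\big (\frac {1}{2}\sum _{i} \mathrm {Ex}[\chi ^{2}(\mathsf {X}^{i-1})]\big )^{1/2}\) of Theorem 1.

```python
# Sketch: exact check of the chi-squared bound for trRP with tiny parameters.
# We enumerate all WOR sequences, build the joint output distribution P0,
# and compare d_TV(P0, uniform) against sqrt(0.5 * sum_i Ex[chi2(X^{i-1})]).
from itertools import permutations
from collections import defaultdict
from math import sqrt

n, m, q = 4, 2, 3
N, M = 1 << n, 1 << m

# Joint distribution P0 of (X_1, ..., X_q) with X_i = trunc_m(V_i)
P0 = defaultdict(float)
for vs in permutations(range(N), q):
    P0[tuple(v >> (n - m) for v in vs)] += 1 / (N * (N - 1) * (N - 2))

# Exact total variation from the uniform distribution on ({0,1}^m)^q
d_tv = 0.5 * sum(abs(P0.get((a, b, c), 0.0) - M ** -q)
                 for a in range(M) for b in range(M) for c in range(M))

# (1/2) * sum_i Ex[chi2(X^{i-1})], expectation under P0's prefix marginal
bound_sq = 0.0
for i in range(1, q + 1):
    prefix = defaultdict(float)   # marginal of (X_1, ..., X_{i-1})
    joint = defaultdict(float)    # marginal of (X_1, ..., X_i)
    for xs, pr in P0.items():
        prefix[xs[:i - 1]] += pr
        joint[xs[:i]] += pr
    for xp, prp in prefix.items():
        chi2 = sum((joint.get(xp + (x,), 0.0) / prp - 1 / M) ** 2 * M
                   for x in range(M))
        bound_sq += 0.5 * prp * chi2

assert d_tv <= sqrt(bound_sq)     # Theorem 1's guarantee holds exactly
```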
4 Overview of the proof by Dai et al. and its flaw
In this section, we provide a brief overview of the proof by Dai et al. in order to precisely point out the gap in it. We mostly follow the notation of the authors along with our own notational conventions. For example, we mostly use N instead of 2n. Moreover, for simplicity, we write the set {0,1}n ∖{0n} as [N]∗.
Theorem 3 ([7])
Fix an integern ≥ 8and letN = 2n. Forany adversary\(\mathcal {A}\)that makes\(q \leq \frac {N}{32}\)queries we have
Proof due to Dai et al. in [7]
Let \(\mathcal {A}\) be an adversary making exactly q distinct queries adaptively. As we have observed before, the output distributions of random function and XOR function do not depend on \(\mathcal {A}\). In fact, \(\mathsf {U}^{\prime }_{1}, \ldots , \mathsf {U}^{\prime }_{q} \leftarrow _{\$} \{0,1\}^{n}\) and X1 :=V1 ⊕V2,…,X q :=V2q− 1 ⊕V2q are the outputs of random function and XOR construction respectively, where V1,…,V2q←wor{0,1}n. Let P 1 and P2 denote the output distributions of X := (X1,…,X q ) and \(\mathsf {U}^{\prime } := (\mathsf {U}^{\prime }_{1}, \ldots , \mathsf {U}^{\prime }_{q})\) respectively. Thus,
Now, we note that the X i ’s cannot take the value 0n and hence it is natural to consider the q-tuple of random variables U1,…,U q ← $ [N]∗ := {0,1}n ∖{0n}. Let us denote by P 0 the probability distribution of U1,…,U q . By simple algebra, we have dTV(P 0 ,P2) ≤ q/2n. Also, using the triangle inequalityFootnote 4, we have
At this point, the χ2 method (i.e., Theorem 1) gives an upper bound on dTV(P 0 ,P 1 ). The rest of the proof is devoted to showing that \(d_{\text {TV}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}}) \leq \frac {0.5q + 3 \sqrt {q}}{2^{n}}\).
For every non-zero x1,…,x i , we clearly have \(\mathbf {P}_{\mathbf {0}|x^{i-1}}(x_{i}) = 1/(N-1)\). For simplicity, let us denote by yi,x the conditional probability \(\mathbf {P}_{\mathbf {1}|x^{i-1}}(x)\), which is also a function of xi− 1. When xi− 1 is chosen following the distribution of Xi− 1, we denote yi,x as Yi,x. From the definition of the χ2 function corresponding to (X1,…,X q ) and (U1,…,U q ), we have
Now, we give a brief description of the remaining, critical part of the proof, where the authors provide an upper bound on Ex[χ2(Xi− 1)]. We keep the authors' flow (suppressing some calculations, which will be denoted as ∗∗∗) and wording. However, we change some of their notations in order to make them consistent with ours. The authors complete the proof as described below.
We now expand Yi,x into a more expressive and convenient formula to work with. ∗∗∗ Let S = {V1,V2,…,V2i− 2}. Let Di,x be the number of pairs (u,u ⊕ x) such that both u and u ⊕ x belong to S. Note that S and Di,x are both random variables, and in fact functions of the random variables V1,V2,…,V2i− 2. ∗∗∗ Hence,
From (7),
In the last formula, it is helpful to think of each Di,x as a function of V1,V2,…,V2i− 2, and the expectation is taken over the choices of V1,V2,…,V2i− 2 sampled uniformly without replacement from {0,1}n. We will show thatFootnote 5 for any x ∈{0,1}n ∖{0n},
and thus
Summing up, from χ2-method
4.1 Flaw in the above proof and its repair under an assumption
Let us revisit (8). Let us fix distinct v1,…,v2i− 2 and define the set \({\mathcal {S}} = \{v_{1}, \ldots , v_{2i-2} \}\). Let Di,x denote the number of pairs (u,u ⊕ x) such that both u and u ⊕ x belong to \(\mathcal {S}\). Let x1 = v1 ⊕ v2,…,xi− 1 = v2i− 3 ⊕ v2i− 2. Now, it is easy to see that
which appeared in the right hand side of (8). Whereas the left hand side of the equation is P(X i = x|X1 = x1,…,Xi− 1 = xi− 1). Note that in general,
does not hold for every v1,…,v2i− 2. Hence, (8) is incorrect.
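The failure of this pointwise identity can be checked by brute force. In the following sketch (ours, with N = 8 and i = 3), conditioning on (v1,…,v4) = (0,1,2,3), so that x1 = x2 = 1, gives the conditional probability 1/3 for x = 3, whereas conditioning only on (x1,x2) gives 1/9.

```python
# Sketch (N = 8, i = 3): brute-force demonstration that
# P(X_3 = x | X_1 = x_1, X_2 = x_2) can differ from
# P(V_5 xor V_6 = x | V_1 = v_1, ..., V_4 = v_4), even when x_1 = v_1 xor v_2
# and x_2 = v_3 xor v_4.
from itertools import permutations

N = 8  # i.e., n = 3

def given_v(vs, x):
    """P(V5 xor V6 = x | V^4 = vs): V5, V6 drawn WOR from the remaining values."""
    rest = [u for u in range(N) if u not in vs]
    pairs = [(a, b) for a in rest for b in rest if a != b]
    return sum(1 for a, b in pairs if a ^ b == x) / len(pairs)

def given_x(x1, x2, x):
    """P(X3 = x | X1 = x1, X2 = x2): average of given_v over consistent v^4."""
    vals = [given_v(vs, x) for vs in permutations(range(N), 4)
            if vs[0] ^ vs[1] == x1 and vs[2] ^ vs[3] == x2]
    return sum(vals) / len(vals)

v = (0, 1, 2, 3)                             # so x1 = x2 = 1
lhs = given_x(v[0] ^ v[1], v[2] ^ v[3], 3)   # = 1/9
rhs = given_v(v, 3)                          # = 1/3
assert abs(lhs - 1/9) < 1e-12 and abs(rhs - 1/3) < 1e-12 and lhs != rhs
```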
After observing this flaw in the proof, let us see how we can fix it. If we can prove (10) in some other way, we can still continue with the rest of the proof. This can be done if we can prove a variant of (8) as follows:
where c = 1/(N − 1). In other words,
The above equation is equivalent to
We will show in the subsequent section that (14) is not actually true. This in fact shows that (13) is a strict inequality for certain choices of the v i ’s. However, the proof can still survive if we have the following weaker statement, which we place here as an assumption.
Assumption 1
Under this assumption, we can justify the inequalities appearing in (9) and (10). Hence, we can complete the proof of PRF-security of XOR construction under the above assumption. In the following section, we study the assumption for i = 2 and i = 3.
5 On the correctness of Assumption 1
In this section, we study the correctness of our assumption stated in the previous section for some small choices of i, namely for i = 2 and 3.
Theorem 4
Proof
For notational simplicity (identifying {0,1}n with {0,…,N − 1}), we will write [N]∗ to represent the set {1,2,…,N − 1}. Now, we evaluate the two sides of (16).
L.H.S. of (16): Expanding the l.h.s. of (16) we get
Now, it follows that
Next, we split the sum in (17) into the following two subcases: (i) Case 1.1, where we consider the condition x2 = x1, and (ii) Case 1.2, where we consider the condition x2≠x1. In each of the subcases, we determine the number of tuples (x2,x1) and probability P(X2 = x2,X1 = x1) of a particular tuple satisfying the conditions of the subcase.
Case 1.1: (x2 = x1).
Next, we have
Therefore,
Case 1.2: (x2≠x1).
Now,
Hence,
R.H.S. of (16): Expanding the r.h.s. of (16) we get
Next, it follows that
Similar to the l.h.s., we split the sum in the r.h.s. of (21) depending on the conditions (i) x2 = v1 + v2 (Case 2.1), and (ii) x2≠v1 + v2 (Case 2.2). In each of the subcases, we determine the number of tuples (x2,v2) and the probability P(X2 = x2,V2 = v2) of a particular tuple satisfying the conditions of the subcase.
Case 2.1: (x2 = v1 + v2).
So,
Case 2.2: (x2≠v1 + v2).
Therefore,
The theorem follows by comparing (20) and (24).
So we have shown that our assumption is valid for i = 2 (in fact, the two sides are equal). Now we show that the assumption is also valid for i = 3; in this case, we have a strict inequality.
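Consistent with the equality for i = 2, a brute-force check (ours, with N = 8) confirms that the conditional probability of X2 given X1 = x1 coincides with the conditional probability of V3 ⊕V4 given any (v1,v2) with v1 ⊕ v2 = x1: the latter does not depend on the particular choice of (v1,v2).

```python
# Sketch (N = 8, i = 2): for the first two queries, the conditional
# P(V3 xor V4 = x | V1 = v1, V2 = v2) depends on (v1, v2) only through
# x1 = v1 xor v2, so it agrees pointwise with P(X2 = x | X1 = x1).
from itertools import permutations

N = 8

def given_v(v1, v2, x):
    rest = [u for u in range(N) if u not in (v1, v2)]
    pairs = [(a, b) for a in rest for b in rest if a != b]
    return sum(1 for a, b in pairs if a ^ b == x) / len(pairs)

for x1 in range(1, N):
    for x in range(1, N):
        vals = {given_v(v1, v2, x)
                for v1, v2 in permutations(range(N), 2) if v1 ^ v2 == x1}
        assert len(vals) == 1   # the conditional does not depend on (v1, v2)
```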
Theorem 5
Proof
As in the proof of the previous theorem, we consider the two sides of (25) separately. However, in this proof, the calculations are more involved. In order to help the reader follow the steps of the proof, we show the proof structure in Fig. 1. In the figure, the root node corresponds to (25), and its two children correspond to calculation of the two sides of (25). In the remaining nodes, we show the conditions corresponding to the subcases.
L.H.S. of (25): Expanding the l.h.s. of (25) we get
We split the outer sum in the r.h.s. of (26) depending on the conditions (i) x1 = x2 (Case 1.1), and (ii) x1≠x2 (Case 1.2). For each of these subcases and their further subcases, we determine the number of tuples (x3,x2) and the probability P(X3 = x3,X2 = x2) of a particular tuple satisfying the conditions of the subcase.
Case 1.1: (x1 = x2).
In this case, we have
Next, we consider the following two subcases of this case: (i) Case 1.1.1, where we consider the condition x3 = x1, and (ii) Case 1.1.2, where we consider the condition x3≠x1.
Case 1.1.1: (x3 = x1 = x2).
Case 1.1.2:(x3≠x1 = x2)
In order to calculate the probability in this subcase, we first observe that for fixed x1, the number of possible choices for (v1,v2) is N. Since v3,v4∉{v1,v2}, the number of possible choices for (v3,v4) is N − 2. Further, we also require v5,v6∉{v1,v2,v3,v4}. This is equivalent to the condition that v5∉S = {v1,v2,v3,v4}∪{v1 + x3,v2 + x3,v3 + x3,v4 + x3}. Therefore, in order to determine the number of possible choices of (v5,v6), we need to calculate the cardinality of the set S.
Here, we observe that in order to calculate the cardinality of S, it is enough to determine |{v1}∩{v1 + x3,v2 + x3,v3 + x3,v4 + x3}|. It is clear that v1∉{v1 + x3,v2 + x3}. Hence, we are left with the following two subcases: Case 1.1.2.a and Case 1.1.2.b.
Case 1.1.2.a: (v1 ∈{v3 + x3,v4 + x3}). We have the following two possibilities.

1. v1 = v3 + x3: In this case, it also follows that v2 = v4 + x3. So, |S| = 4.

2. v1 = v4 + x3: Similar to the above subcase. Hence, |S| = 4.

Case 1.1.2.b: (v1∉{v3 + x3,v4 + x3}). In this case, |S| = 8.
Therefore, out of the total N(N − 2) choices of \(v^{4} := (v_{1}, \ldots , v_{4})\), for 2N choices we have v1 ∈{v3 + x3,v4 + x3}. For these 2N choices of \(v^{4}\), the number of possible choices of (v5,v6) is N − 4. For the remaining N(N − 4) choices of \(v^{4}\), the number of possible choices of (v5,v6) is N − 8. Hence,
Summing up the cases 1.1.1 and 1.1.2, we get
Case 1.2:(x1≠x2).
Here, we have that for fixed x1, the number of possible choices of (v1,v2) is N. Now, v3∉{v1,v2}∪{v1 + x2,v2 + x2}. Also, {v1,v2}∩{v1 + x2,v2 + x2} = ∅. Therefore, the number of possible choices of (v3,v4) is N − 4. So,
Next, we consider the following three subcases depending on (i) x3 = x1 (Case 1.2.1), (ii) x3 = x2 (Case 1.2.2), and (iii) x3∉{x1,x2} (Case 1.2.3).
Case 1.2.1: (x3 = x1)
By following the same argument as in Case 1.1.2, we get
Case 1.2.2: (x3 = x2). This subcase is the same as the previous one, i.e., Case 1.2.1.

Case 1.2.3: (x3∉{x1,x2}). Here, we have the following subcases to consider depending on (i) x3 = x1 + x2 (Case 1.2.3.a) and (ii) x3≠x1 + x2 (Case 1.2.3.b).
Case 1.2.3.a:(x3 = x1 + x2).
Following an analysis similar to Case 1.1.2 we obtain
Case 1.2.3.b:(x3≠x1 + x2).
In this case also, we follow an analysis similar to Case 1.1.2 to obtain
By adding the cases 1.2.1, 1.2.2, 1.2.3.a, and 1.2.3.b, we obtain
Finally, by adding the expressions obtained in Cases 1.1 and 1.2, we get the following expression for the l.h.s. of (25).
where the numerator and the denominator of the above expression are \(N^{5} - 22N^{4} + 195N^{3} - 870N^{2} + 1936N - 1568\) and \(N^{6} - 23N^{5} + 217N^{4} - 1073N^{3} + 2926N^{2} - 4160N + 2400\), respectively.
R.H.S. of (25): Next, we obtain the expression for the r.h.s. of (25). As in the case of the l.h.s., we expand the sum under expectation as follows.
Clearly, we have
Similar to Case 1, we split the outer sum in the r.h.s. of (31) into two subcases: (i) in Case 2.1, we consider the condition v1 + v2 = v3 + v4, and (ii) in Case 2.2, we consider the condition v1 + v2≠v3 + v4. For each of these subcases and their further subcases, we determine the number of tuples (x3,v4) and the probability P(V5 +V6 = x3,V4 = v4) of a particular tuple satisfying the conditions of the subcase.

Case 2.1: (v1 + v2 = v3 + v4). We further divide this subcase according to the conditions (i) x3 = v1 + v2 = v3 + v4 (Case 2.1.1), and (ii) x3≠v1 + v2 = v3 + v4 (Case 2.1.2).
Case 2.1.1:(x3 = v1 + v2 = v3 + v4).
Clearly, we have
And for fixed (v5,v6) and v4,
So,
Case 2.1.2: (x3≠v1 + v2 = v3 + v4). By following an argument similar to Case 1.1.2, we arrive at the following two subcases depending on the conditions (i) v1 ∈{v3 + x3,v4 + x3} (Case 2.1.2.a), and (ii) v1∉{v3 + x3,v4 + x3} (Case 2.1.2.b).

Case 2.1.2.a: (v1 ∈{v3 + x3,v4 + x3}).
The number of possible choices of the pair (v5,v6) is N − 4. Therefore,
Case 2.1.2.b:(v1∉{v3 + x3,v4 + x3}).
Also, the number of possible choices of the pair (v5,v6) is N − 8. So, in this case we have
So, summing up the cases 2.1.1, 2.1.2.a and 2.1.2.b, we get
Case 2.2: (v1 + v2≠v3 + v4). We split this case according to the conditions (i) x3 ∈{v1 + v2,v3 + v4} (Case 2.2.1), (ii) x3 = v1 + v2 + v3 + v4 (Case 2.2.2), and (iii) x3∉{v1 + v2,v3 + v4,v1 + v2 + v3 + v4} (Case 2.2.3).

Case 2.2.1: (x3 ∈{v1 + v2,v3 + v4}).
In this case, the number of possible choices for the pair (v5,v6) is N − 6. So, we have
Case 2.2.2:(x3 = v1 + v2 + v3 + v4)
The number of possible choices for the pair (v5,v6) is N − 8. Hence,
Case 2.2.3: (x3∉{v1 + v2,v3 + v4,v1 + v2 + v3 + v4}). We further divide this case into the following two subcases depending on (i) v1 ∈{v3 + x3,v4 + x3} (Case 2.2.3.a), and (ii) v1∉{v3 + x3,v4 + x3} (Case 2.2.3.b).

Case 2.2.3.a: (v1 ∈{v3 + x3,v4 + x3}).
The number of possible choices for the pair (v5,v6) is N − 6. Therefore,
Case 2.2.3.b: (v1∉{v3 + x3,v4 + x3}).
The number of possible choices of the pair (v5,v6) is N − 8. So, in this case, we have
So, summing up the subcases 2.2.1, 2.2.2, 2.2.3.a, and 2.2.3.b, we get
Finally, adding (32) and (33) we get
The theorem follows by comparing the final expressions obtained for the two sides (the difference between the two being \(\frac{16(N^{2} - 8N + 8)}{(N-2)(N-3)(N-4)^{2}(N-5)^{2}}\), which is > 0 for N ≥ 8).
6 Concluding remarks
In the concluding part of their paper [7], the authors consider the XOR of two independent random permutations construction, which is a variant of the XOR of two random permutations construction. Given two independent random permutations π1,π2 of {0,1}n, the construction creates a function g : {0,1}n↦{0,1}n as g(x) = π1(x) ⊕ π2(x). The authors show full security of this construction as well. The steps of the proof are quite similar to the previous one. We mention that the same gap exists in this proof as well (in the same step as (8) in this paper), which can be demonstrated in a similar manner as our proofs in this note. To avoid repetition, we do not show this part.
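For completeness, here is a small sketch (ours, with toy parameters) of this variant; note that, unlike the single-permutation XOR construction, g can output 0n (exactly when π1(x) = π2(x)), and its domain is all of {0,1}n.

```python
# Illustrative sketch: the variant g(x) = pi1(x) xor pi2(x) built from two
# independent random permutations of {0,1}^n (small n).
import random

n = 6
N = 1 << n
pi1, pi2 = list(range(N)), list(range(N))
random.shuffle(pi1)
random.shuffle(pi2)

g = [pi1[x] ^ pi2[x] for x in range(N)]
# Unlike the single-permutation XOR construction, g(x) = 0^n is possible here
# (exactly when pi1(x) = pi2(x)); the domain is the full set {0,1}^n.
assert len(g) == N and all(0 <= y < N for y in g)
```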
Notes
This line of work was initiated by Bellare et al. in [2] who coined the term “Luby-Rackoff backwards” for such conversion.
A quote from the paper [7]: “Patarin's tight proof is very involved, with some claims remaining open or unproved.”
In fact, in this setting, i.e., for information theoretic security, there always exists an adversary \(\mathcal {A}^{\prime }\) such that \( \mathbf {Adv}^{\text {prf}}_{f}(\mathcal {A}^{\prime }) = d_{\text {TV}}(\mathbf {P}_{\mathbf {1}}, \mathbf {P}_{\mathbf {0}})\); \(\mathcal {A}^{\prime }\) returns 1 for any \(x^{q} \in \mathcal {E}^{\prime }\), where \(\mathcal {E}^{\prime }\) is such that \(d_{\text {TV}}(\mathbf {P}_{\mathbf {1}}, \mathbf {P}_{\mathbf {0}})= \mathbf {P}_{\mathbf {0}}({\mathcal {E}}^{\prime }) - \mathbf {P}_{\mathbf {1}}({\mathcal {E}}^{\prime })\).
Triangle inequality of total variation metric can be easily shown from the triangle inequality in real numbers.
This has been shown later in the proof given by Dai et al. Since we do not provide details of that claim in this paper, we skip its proof here.
Let a1,…,a n and b1,…,b n be nonnegative numbers. We denote the sums \(\sum _{i} a_{i}\) and \(\sum _{i} b_{i}\) by a and b, respectively. The log sum inequality states that \(\sum _{{i = 1}}^{n}a_{i}\log \frac {a_{i}}{b_{i}} \geq a\log \frac {a}{b}\).
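The log sum inequality is easy to check numerically; the following sketch (helper names are our own, natural logarithms throughout) compares both sides on random nonnegative inputs.

```python
import math
import random

def log_sum_lhs(a, b):
    # sum_i a_i log(a_i / b_i)
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

def log_sum_rhs(a, b):
    # a log(a / b), where a = sum_i a_i and b = sum_i b_i
    return sum(a) * math.log(sum(a) / sum(b))

rng = random.Random(0)
for _ in range(200):
    a = [rng.uniform(0.1, 2.0) for _ in range(5)]
    b = [rng.uniform(0.1, 2.0) for _ in range(5)]
    assert log_sum_lhs(a, b) >= log_sum_rhs(a, b) - 1e-12
```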
References
Bellare, M., Impagliazzo, R.: A tool for obtaining tighter security analyses of pseudorandom function based constructions, with applications to PRP to PRF conversion. IACR Cryptol. ePrint Arch. 1999, 24 (1999)
Bellare, M., Krovetz, T., Rogaway, P.: Luby-Rackoff backwards: Increasing security by making block ciphers non-invertible, pp 266–280. Springer, Berlin (1998)
Black, J., Rogaway, P.: A block-cipher mode of operation for parallelizable message authentication. In: EUROCRYPT 2002, volume 2332 of LNCS, pp 384–397. Springer (2002)
Cogliati, B., Lampe, R., Patarin, J.: The Indistinguishability of the XOR of k Permutations. In: Fast Software Encryption - 21st International Workshop, FSE 2014, London, UK, March 3-5, 2014. Revised Selected Papers, edited by C. Cid and C. Rechberger, volume 8540 of Lecture Notes in Computer Science, pp 285–302. Springer (2014)
Cogliati, B., Seurin, Y.: EWCDM: An Efficient, Beyond-Birthday Secure, Nonce-Misuse Resistant MAC. In: CRYPTO 2016, Proceedings, Part I, pp 121–149 (2016)
Cover, T.M., Thomas, J.A.: Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing), Wiley-Interscience (2006)
Dai, W., Hoang, V.T., Tessaro, S.: Information-theoretic Indistinguishability via the Chi-squared Method, Cryptology ePrint Archive, Report 2017/537. http://eprint.iacr.org/2017/537 (2017)
Gilboa, S., Gueron, S.: Distinguishing a truncated random permutation from a random function, IACR Cryptology ePrint Archive 2015, 773 (2015)
Gilboa, S., Gueron, S.: The advantage of truncated permutations, CoRR arXiv:1610.02518 (2016)
Gilboa, S., Gueron, S., Morris, B.: How Many Queries are Needed to Distinguish a Truncated Random Permutation from a Random Function? Journal of Cryptology (2017)
Gueron, S., Langley, A., Lindell, Y: AES-GCM-SIV: Specification and Analysis, IACR Cryptology ePrint Archive 2017, 168 (2017)
Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002)
Hall, C., Wagner, D., Kelsey, J., Schneier, B.: Building PRFs from PRPs, pp 370–389. Springer, Berlin (1998)
Iwata, T., Minematsu, K., Peyrin, T., Seurin, Y.: ZMAC: A fast tweakable block cipher mode for highly secure message authentication, IACR Cryptology ePrint Archive 2017, 535 (2017)
Iwata, T.: New blockcipher modes of operation with beyond the birthday bound security. In: Fast Software Encryption, 13th International Workshop, FSE 2006, Graz, Austria, March 15-17, 2006, Revised Selected Papers, edited by M. J. B. Robshaw, volume 4047 of Lecture Notes in Computer Science, pp 310–327. Springer (2006)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Statist. 22(1), 79–86 (1951)
Lucks, S.: The sum of PRPs is a secure PRF. In: EUROCRYPT 2000, volume 1807 of LNCS, pp 470–484. Springer (2000)
Liese, F., Vajda, I.: Convex statistical distances. Teubner, Leipzig (1987)
Mennink, B., Neves, S.: Encrypted Davies-Meyer and its dual: Towards optimal security using Mirror theory, Cryptology ePrint Archive, Report 2017/xxx, to be published in CRYPTO 2017 (2017)
Patarin, J.: The “Coefficients H” Technique. In: Selected Areas in Cryptography, 2008, volume 5381 of LNCS, pp 328-345. Springer (2008)
Patarin, J.: A Proof of Security in O(2^n) for the Xor of Two Random Permutations. In: ICITS 2008, volume 5155 of LNCS, pp 232–248. Springer (2008)
Patarin, J.: Introduction to Mirror Theory: Analysis of Systems of Linear Equalities and Linear Non Equalities for Cryptography, Cryptology ePrint Archive, Report 2010/287. http://eprint.iacr.org/2010/287 (2010)
Reiss, R.-D.: Approximate distributions of order statistics: with applications to nonparametric statistics. Springer Science & Business Media, Berlin (2012)
Slivkins, A.: Lecture Notes CMSC 858G: Bandits, Experts and Games (Lecture 3). http://www.cs.umd.edu/slivkins/CMSC858G-fall16/Lecture3.pdf (2016)
Stam, A.J.: Distance between sampling with and without replacement. Statistica Neerlandica 32(2), 81–91 (1978)
Acknowledgments
We thank the reviewers for their comments and suggestions which have improved the quality of our manuscript.
Additional information
This article is part of the Topical Collection on Special Issue on Statistics in Design and Analysis of Symmetric Ciphers
Appendix A: Proof of the χ2 method
In this section we provide the proof of Theorem 1, which is the heart of the χ2 method. The proof is based on Lemma 1, Lemma 2, and Theorem 6. Along the way we also briefly mention some relevant facts about the KL divergence and the χ2 distance.
Kullback-Leibler Divergence. The Kullback-Leibler divergence (KL divergence), or relative entropy, from P 0 to P 1 is defined as
\[d_{\text{KL}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) = \sum\limits_{x \in {\Omega}} \mathbf{P}_{\mathbf{0}}(x) \log \frac{\mathbf{P}_{\mathbf{0}}(x)}{\mathbf{P}_{\mathbf{1}}(x)}.\]
Note that the KL divergence is defined only if P 0 ≪ P 1 (with the convention that \(0 \log {0\over 0}= 0\)). It was first defined by Kullback and Leibler in 1951 [16] as a generalization of Shannon's notion of entropy (see [6]).
It can be shown that the KL divergence between any two distributions is always non-negative (known as Gibbs’ inequality, see [6]). However, it is not symmetric (i.e., dKL(P 0 ,P 1 )≠dKL(P 1 ,P 0 ) in general) and does not satisfy the triangle inequality. Thus, the KL divergence is not a metric.
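These two properties are easy to observe on a toy example. In the following sketch (the helper `d_kl` is our own, using natural logarithms) the KL divergence is non-negative in both directions but direction-dependent.

```python
import math

def d_kl(p, q):
    # KL divergence with natural logarithm; requires p << q, with the
    # convention 0 log(0/0) = 0 handled by skipping zero-probability terms.
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

p0 = [0.6, 0.3, 0.1]
p1 = [0.2, 0.3, 0.5]

# Non-negative in both directions (Gibbs' inequality) ...
assert d_kl(p0, p1) >= 0 and d_kl(p1, p0) >= 0
# ... but not symmetric.
assert abs(d_kl(p0, p1) - d_kl(p1, p0)) > 1e-6
```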
Though not a metric, the KL divergence has some useful properties. For example, the KL divergence between any two product distributions is additive over the corresponding marginals (see [6, 23]). Moreover, the KL divergence between two joint distributions can be obtained as the sum of the KL divergences of the corresponding conditional distributions. This is known as the chain rule of KL divergence, and it is one of the crucial ingredients of the χ2 method. We elaborate on it in more detail below.
Chain rule of KL divergence. Let \( \mathbf {P}_{\mathbf {0}}^{\textit {\textbf {q}}}\) and \(\mathbf {P}_{\mathbf {1}}^{\textit {\textbf {q}}}\) be two probability distributions over Ωq. For 1 ≤ i ≤ q, we write \(\mathbf {P}_{\mathbf {0}}^{ {i}}\) and \(\mathbf {P}_{\mathbf {1}}^{ {i}}\) for the marginal probability distributions of the first i coordinates of \(\mathbf {P}_{\mathbf {0}}^{\textit {\textbf {q}}}\) and \(\mathbf {P}_{\mathbf {1}}^{\textit {\textbf {q}}}\), respectively. In other words, if X := (X1,…,X q ) and Y := (Y1,…,Y q ) are two joint random variables following the probability distributions \(\mathbf {P}_{\mathbf {0}}^{q}\) and \(\mathbf {P}_{\mathbf {1}}^{q}\), then \(\textbf {P}^{i}_{0}\) and \(\textbf {P}^{i}_{1}\) represent the probability distributions of \(\mathsf {X}^{i}\) and \(\mathsf {Y}^{i}\), respectively. We recall that \(\mathbf {P}_{\mathbf {0}|x^{i-1}}(x_{i})\) denotes the conditional probability \(\textbf {P}(\mathsf {X}_{i} = x_{i}|\mathsf {X}^{i-1} = x^{i-1})\), and similarly for \(\mathbf {P}_{\mathbf {1}|x^{i-1}}(x_{i})\). Moreover, \(\text {KL}(x^{i-1}) = d_{\text {KL}}(\mathbf {P}_{\mathbf {0}|x^{i-1}}, \mathbf {P}_{\mathbf {1}|x^{i-1}})\). We can now state the chain rule of KL divergence.
Lemma 1 (Chain rule of KL divergence (see [6], Theorem 2.5.3))
Following the above notations,
\[d_{\text{KL}}\left(\mathbf{P}_{\mathbf{0}}^{q}, \mathbf{P}_{\mathbf{1}}^{q}\right) = \sum\limits_{i=1}^{q} \mathbf{E}_{x^{i-1} \sim \mathbf{P}_{\mathbf{0}}^{i-1}}\left[\text{KL}(x^{i-1})\right].\]
Proof
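The chain rule can also be verified numerically on a small example. The sketch below (all helper names are our own) checks, for q = 2 and Ω = {0, 1}, that the KL divergence of the joints equals the KL divergence of the first marginals plus the P 0-expectation of the conditional KL divergences.

```python
import itertools
import math
import random

omega = (0, 1)

def d_kl(p, q):
    # KL divergence between distributions given as {outcome: probability}.
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)

def random_joint(rng):
    keys = list(itertools.product(omega, omega))
    w = [rng.uniform(0.1, 1.0) for _ in keys]
    s = sum(w)
    return {k: wi / s for k, wi in zip(keys, w)}

def marginal(p):
    # Distribution of the first coordinate.
    return {a: sum(p[(a, b)] for b in omega) for a in omega}

def conditional(p, a):
    # Distribution of the second coordinate given that the first equals a.
    m = marginal(p)[a]
    return {b: p[(a, b)] / m for b in omega}

rng = random.Random(1)
p0, p1 = random_joint(rng), random_joint(rng)
m0, m1 = marginal(p0), marginal(p1)

lhs = d_kl(p0, p1)  # KL divergence of the joint distributions
rhs = d_kl(m0, m1) + sum(  # chain rule: marginal term + expected conditional KL
    m0[a] * d_kl(conditional(p0, a), conditional(p1, a)) for a in omega
)
assert abs(lhs - rhs) < 1e-12
```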
The next inequality due to Pinsker (see [6]) gives an upper bound on the total variation distance between two distributions in terms of their KL divergence.
Theorem 6 (Pinsker’s Inequality)
For all probability distributions P 0, P 1,
\[d_{\text{TV}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) \leq \sqrt{\frac{1}{2}\, d_{\text{KL}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}})}.\]
Proof
We follow the steps of [24]. Let Ω′ = {x ∈ Ω | P 0(x) ≥ P 1(x)}. Also, let \(p_{i} = \sum _{x \in {\mathrm {\Omega }^{\prime }}} \mathbf {\textit {P}}_{\mathbf {\textit {i}}}(x)\) for i ∈ {0, 1}. So, dTV(P 0, P 1) = p0 − p1. Also, by the log sum inequality (Footnote 6), we have \(d_{\text {KL}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}}) \geq p_{0} \log {p_{0} \over p_{1}}+ (1-p_{0}) \log {(1-p_{0}) \over (1-p_{1})}\). Therefore, using the standard bound \(p_{0} \log {p_{0} \over p_{1}}+ (1-p_{0}) \log {(1-p_{0}) \over (1-p_{1})} \geq 2(p_{0} - p_{1})^{2}\), we get \(d_{\text {KL}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}}) \geq 2\, d_{\text {TV}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}})^{2}\), which is the claimed inequality.
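Pinsker's inequality can be checked empirically as well; the following sketch (helper names are our own, natural logarithms throughout) tests the bound on random distributions.

```python
import math
import random

def d_tv(p, q):
    # Total variation distance.
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def d_kl(p, q):
    # KL divergence, natural logarithm.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def random_dist(rng, k):
    w = [rng.uniform(0.01, 1.0) for _ in range(k)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)
for _ in range(1000):
    p, q = random_dist(rng, 6), random_dist(rng, 6)
    # Pinsker: d_TV(p, q) <= sqrt(d_KL(p, q) / 2).
    assert d_tv(p, q) <= math.sqrt(0.5 * d_kl(p, q)) + 1e-12
```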
χ2 distance. The χ2 distance has its origin in mathematical statistics, dating back to Pearson (see [18] for some history). The χ2 distance between P 0 and P 1, with P 0 ≪ P 1, is defined as
\[d_{\chi^{2}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) = \sum\limits_{x \in {\Omega}} \frac{\left(\mathbf{P}_{\mathbf{0}}(x) - \mathbf{P}_{\mathbf{1}}(x)\right)^{2}}{\mathbf{P}_{\mathbf{1}}(x)}.\]
It can be seen that χ2 distance is not symmetric. Therefore, it is not a metric. However, like KL-divergence, χ2 distance between product distributions can be bounded in terms of the χ2 distances between their marginals (see [23]). The following lemma shows that KL-divergence between two distributions can be upper bounded by their χ2 distance. The first inequality can also be found in earlier works (see [12] for this and many other relations among various distances used in Statistics).
Lemma 2
\(d_{\text {KL}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}}) \leq \log (1 + d_{\chi ^{2}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}})) \leq d_{\chi ^{2}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}})\) .
Proof
By the definition of the χ2 distance, and using Jensen's inequality for the concave function log, we have
\[d_{\text{KL}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}}) = \sum\limits_{x} \mathbf{P}_{\mathbf{0}}(x) \log \frac{\mathbf{P}_{\mathbf{0}}(x)}{\mathbf{P}_{\mathbf{1}}(x)} \leq \log \sum\limits_{x} \frac{\mathbf{P}_{\mathbf{0}}(x)^{2}}{\mathbf{P}_{\mathbf{1}}(x)} = \log\left(1 + d_{\chi^{2}}(\mathbf{P}_{\mathbf{0}}, \mathbf{P}_{\mathbf{1}})\right).\]
The last inequality follows by observing that \(d_{\chi ^{2}}(\mathbf {P}_{\mathbf {0}}, \mathbf {P}_{\mathbf {1}}) \geq 0\) and log(1 + t) ≤ t for t ≥ 0.
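Both inequalities of Lemma 2 can be confirmed numerically; the sketch below (helper names are our own, natural logarithms throughout) compares the KL divergence, log(1 + χ2), and the χ2 distance on random distributions.

```python
import math
import random

def d_kl(p, q):
    # KL divergence, natural logarithm.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def d_chi2(p, q):
    # chi^2 distance: sum_x (p(x) - q(x))^2 / q(x), assuming p << q.
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

def random_dist(rng, k):
    w = [rng.uniform(0.01, 1.0) for _ in range(k)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)
for _ in range(1000):
    p, q = random_dist(rng, 5), random_dist(rng, 5)
    c = d_chi2(p, q)
    # d_KL <= log(1 + chi^2) <= chi^2.
    assert d_kl(p, q) <= math.log(1 + c) + 1e-12 <= c + 2e-12
```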
1.1 A.1 Proof of Theorem 1
We are now ready to show the upper bound on \(d_{\text {TV}}(\mathbf {P}_{\mathbf {0}}^{\textit {\textbf {q}}}, \mathbf {P}_{\mathbf {1}}^{\textit {\textbf {q}}})\) in terms of the expected value of the χ2 distance between the conditional distributions \(\mathbf {P}_{\mathbf {0}|x^{i-1}}\) and \(\mathbf {P}_{\mathbf {1}|x^{i-1}}\). We state and prove the χ2 method, i.e. Theorem 1.
Proof
of Theorem 1 The proof follows directly from Pinsker's inequality (Theorem 6), the chain rule of KL divergence (Lemma 1), and Lemma 2. More precisely, we have
\[d_{\text{TV}}\left(\mathbf{P}_{\mathbf{0}}^{q}, \mathbf{P}_{\mathbf{1}}^{q}\right) \leq \sqrt{\frac{1}{2}\, d_{\text{KL}}\left(\mathbf{P}_{\mathbf{0}}^{q}, \mathbf{P}_{\mathbf{1}}^{q}\right)} = \sqrt{\frac{1}{2} \sum\limits_{i=1}^{q} \mathbf{E}\left[\text{KL}(\mathsf{X}^{i-1})\right]} \leq \sqrt{\frac{1}{2} \sum\limits_{i=1}^{q} \mathbf{E}\left[\chi^{2}(\mathsf{X}^{i-1})\right]}.\]
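For q = 2 the resulting bound can be verified numerically. In the following sketch (all helper names are our own) the expectation over the first coordinate is taken under P 0, matching the chain rule stated in Lemma 1.

```python
import itertools
import math
import random

omega = (0, 1, 2)

def random_joint(rng):
    keys = list(itertools.product(omega, omega))
    w = [rng.uniform(0.05, 1.0) for _ in keys]
    s = sum(w)
    return {k: wi / s for k, wi in zip(keys, w)}

def marginal(p):
    # Distribution of the first coordinate.
    return {a: sum(p[(a, b)] for b in omega) for a in omega}

def conditional(p, a):
    # Distribution of the second coordinate given that the first equals a.
    m = marginal(p)[a]
    return {b: p[(a, b)] / m for b in omega}

def d_tv(p, q):
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)

def chi2(p, q):
    # chi^2 distance between distributions given as dicts.
    return sum((p[k] - q[k]) ** 2 / q[k] for k in p)

rng = random.Random(3)
p0, p1 = random_joint(rng), random_joint(rng)
m0, m1 = marginal(p0), marginal(p1)

# chi^2-method bound for q = 2: first-coordinate chi^2 plus the
# P0-expectation of the conditional chi^2 distances, halved, square-rooted.
bound = math.sqrt(0.5 * (chi2(m0, m1) + sum(
    m0[a] * chi2(conditional(p0, a), conditional(p1, a)) for a in omega
)))
assert d_tv(p0, p1) <= bound + 1e-12
```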
Cite this article
Bhattacharya, S., Nandi, M. A note on the chi-square method: A tool for proving cryptographic security. Cryptogr. Commun. 10, 935–957 (2018). https://doi.org/10.1007/s12095-017-0276-z
Keywords
- Random permutation
- Pseudorandom function
- χ2 distance
- KL divergence
- Total variation distance
- Pinsker’s inequality
- Sum of random permutations
- Truncated random permutation