1 Introduction

Communication complexity is a central model of computation, first defined by Yao [22], that has found applications in many areas of theoretical computer science. In the 2-party communication complexity setting, we consider two players, Alice and Bob, with unlimited computational power. Each of them receives an input, say \(x\in \mathcal {X}\) for Alice and \(y\in \mathcal {Y}\) for Bob, and their goal is to compute \(f(x,y)\in \mathcal {Z}\) for some fixed function \(f\) with the minimum amount of communication.

Imagine now that Alice and Bob still want to collaboratively compute \(f(x,y)\), while retaining the privacy of their inputs. The loss of privacy measures how much information about \((x,y)\) is leaked to an eavesdropper who has access only to the transcript (external privacy), or how much information about one party’s input is leaked through the transcript to the other party (internal privacy). A perfectly private protocol reveals no information about \(x\) and \(y\) other than what can be inferred from the value of \(f(x,y)\).

For example, if Alice and Bob both want to output the minimum of \(x, y \in \{0,1\}^n\) and the identity of the person holding it, then the deterministic protocol with optimal communication is the trivial protocol of complexity \(2n\). In fact, one can show that any deterministic protocol with optimal communication is not private at all against an eavesdropper, since essentially both players have to send their input to the other. However, a perfectly private deterministic protocol exists, alas with much worse communication complexity: the two parties initialize a counter \(i = 0\) and, in each round \(i = 0, 1, \ldots , 2^n-1\), Alice announces “Yes” if \(x=i\) and “No” otherwise, and Bob announces “Yes” if \(y=i\) and “No” otherwise. If neither party says “Yes”, they increment \(i\) and continue; the protocol ends as soon as someone says “Yes”. It is clear that from the transcript one learns only what can be inferred from the value of the function and nothing more.
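To make the counter protocol concrete, here is a minimal Python sketch (ours, not from the paper; the function name and interface are illustrative assumptions) that simulates it for inputs interpreted as \(n\)-bit integers.

```python
# A minimal sketch of the perfectly private counter protocol described above.
# Not from the paper; names and interface are illustrative assumptions.
def private_min(x: int, y: int, n: int):
    """Both parties announce, for i = 0, 1, ..., whether their input equals i.

    The transcript reveals only min(x, y) and who holds it, i.e. exactly what
    can already be inferred from the output, at the cost of up to 2 * 2**n bits.
    """
    transcript = []
    for i in range(2 ** n):
        alice_says = (x == i)            # Alice: "Yes" iff x = i
        bob_says = (y == i)              # Bob:   "Yes" iff y = i
        transcript.append((alice_says, bob_says))
        if alice_says or bob_says:       # someone said "Yes": the minimum is i
            holder = "both" if (alice_says and bob_says) else ("Alice" if alice_says else "Bob")
            return i, holder, transcript
    raise ValueError("inputs must lie in {0, ..., 2**n - 1}")

print(private_min(5, 3, n=4)[:2])        # -> (3, 'Bob')
```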

In order to quantify privacy, Bar-Yehuda et al. [9] provided a definition of internal privacy of a function \(f\) according to an input distribution \(\mu \), a variation of which has subsequently been referred to as internal information cost (\(\mathrm {IC}^{int}_{\mu }(f)\)). At a high level, it measures the amount of information Alice learns about Bob’s input from the transcript and vice versa. A second type of information cost, called external information cost (\(\mathrm {IC}^{ext}_{\mu }(f)\)), was defined in [12] and measures the amount of information that an external observer learns about Alice’s and Bob’s inputs from the messages they exchange during the protocol. The notions of internal and external information cost have recently found many important applications in communication complexity, including better communication lower bounds for important functions, direct sum theorems and new compression schemes [2, 3, 5–8, 12, 20].

Klauck [15] also defined an information-theoretic notion of privacy, which we denote here by \(\mathrm {PRIV}_{\mu }^{int}(f)\) and which is closely related to the internal information cost (the only difference being that we subtract the information that the function itself reveals about the inputs, which the players are allowed to learn). In fact, the two notions are basically equivalent for boolean functions, and all our results about \(\mathrm {PRIV}\) can be translated to results about information cost. These definitions have the advantage of being easily related to other tools in information theory, but they are not easily interpreted in a combinatorial way.

Feigenbaum et al. [13] gave a combinatorial definition of privacy for the uniform distribution over inputs, which was extended by Ada et al. [1] to any distribution \(\mu \), called average case objective privacy-approximation ratio; we refer to it simply as external privacy-approximation ratio (we only study this average-case notion, and not a related worst-case notion also defined in that work) and denote it by \(\mathrm {PAR}_{\mu }^{ext}(f,P)\). It is the expected value, over inputs \((x,y)\) drawn from \(\mu \), of the following ratio: the number of inputs mapped to the same value by \(f\) (i.e. indistinguishable from \((x,y)\) by looking only at the function’s output) over the number of inputs giving rise to the same transcript as \((x,y)\) (i.e. indistinguishable from \((x,y)\) by looking only at the protocol’s transcript). They also defined the (average) subjective (or internal) privacy-approximation ratio (here again we omit “average”), which we denote \(\mathrm {PAR}_{\mu }^{int}(f,P)\). It captures how much more one player learns about the other’s input from the transcript than from the value of the function, and equals the ratio of the number of Alice’s possible inputs \(x\) that are indistinguishable by looking only at Bob’s input \(y\) and the output of the function, over the number of \(x\)’s that are indistinguishable by looking at \(y\) and the full transcript, plus the symmetric ratio for Bob. Last, they computed lower bounds on the privacy-approximation ratio of several functions, restricting themselves, however, to the case of uniformly distributed inputs.

More recently, Ada et al. [1] modified the definition of the privacy-approximation ratio, which we denote by \(\mathrm {PAR}_{\mu }^{ext}(f)\) and \(\mathrm {PAR}_{\mu }^{int}(f)\), so that it measures the size of subsets of \(\mathcal {X}\times \mathcal {Y}\) not just by counting the number of elements, but relative to the input distribution \(\mu \). They showed that the logarithm of this new definition of internal \(\mathrm {PAR}\) can be lower bounded by the zero-error internal information cost (which can nevertheless be arbitrarily smaller for certain functions with large output range). They also proved a tradeoff between privacy and communication complexity for a specific function (the Vickrey auction) and the uniform distribution over inputs. We note that in [13] and [1] only deterministic protocols were considered. Moreover, the relation between \(\mathrm {PRIV}\) and \(\mathrm {PAR}\) was not very well understood.

Our Results: We prove new relationships between \(\mathrm {PRIV}\), \(\mathrm {PAR}\) and communication complexity, and provide new lower bound techniques for the two notions of privacy, \(\mathrm {PRIV}\) and \(\mathrm {PAR}\), both external and internal, enabling us to give tight bounds on the privacy of various functions in the case of deterministic protocols. We also extend the definitions of \(\mathrm {PRIV}\) and \(\mathrm {PAR}\) to bounded-error randomized protocols, and derive linear lower bounds for various functions.

New lower bounds for external \(\mathrm {PAR}\) of deterministic protocols for boolean functions: For boolean functions we give new lower bound techniques, relating external \(\mathrm {PAR}\) to the rank of the communication matrix and to the deterministic communication complexity.

Theorem 1

For boolean \(f\), for any distribution \(\mu \) with full support, \(\mathrm {PAR}_{\mu }^{ext}(f) \ge \mathrm {rank}\left( \mathcal {M}_{f}\right) -1\).

Theorem 2

For boolean \(f\), for any distribution \(\mu \) with full support, \(\log \mathrm {PAR}_{\mu }^{ext}(f) \ge \sqrt{\mathbf {D}(f)}\).

Observe that this implies that \(\log \mathrm {PAR}_{\mu }^{ext}(f)\) is in fact polynomially related to the deterministic communication complexity. Notably, it therefore holds that the only boolean functions with low privacy loss (as measured using \(\mathrm {PAR}^{ext}\)) are functions that have low communication complexity (this is not the case with non-boolean functions as was already observed by [13]).

New lower bounds for external \(\mathrm {PAR}\) of deterministic protocols for non-boolean functions: For simplicity we restrict ourselves to full-support distributions \(\mu \), but it is possible to extend the results to general distributions by considering summations over only the rectangles whose intersection with \(\mu \)’s support is non-empty. First, we present a general lower bound technique for \(\mathrm {PAR}_{\mu }^{ext}(f)\) via linear programming. We relate it to two other well-known lower bound techniques for communication complexity (see [14]): the rectangle bound (\(\mathrm {rec}(f)\)) and the partition bound (\(\mathrm {prt}(f)\)). This linear program, whose optimal value is denoted by \(\widetilde{{\mathrm {PAR}}}_{\mu }(f)\), can be written as a weighted sum of rectangle bounds \(\mathrm {rec}^{z}(f)\), where the weight equals the \(\mu \)-weight of the inputs \((x,y)\) that are mapped to \(z\) by \(f\). It is hence easy to compute for many functions:

Theorem 3

For all \(f\), for any distribution \(\mu \) with full support, \(\mathrm {PAR}_{\mu }^{ext}(f) \ge \widetilde{{\mathrm {PAR}}}_{\mu }(f)\).

Theorem 4

For all \(f\), for any distribution \(\mu \) with full support, \(\widetilde{{\mathrm {PAR}}}_{\mu }(f) \ge \sum _{z \in \mathcal {Z}} \left| {f^{-1}(z)}\right| _{\mu } \cdot \mathrm {rec}^{z}(f)\).

Moreover, we bound external \(\mathrm {PAR}\) as a weighted sum of the size of the \(z\)-fooling sets \(F_z\) of \(\mathcal {M}_{f}\):

Theorem 5

For all \(f\), for any distribution \(\mu \) with full support, \(\mathrm {PAR}_{\mu }^{ext}(f) \ge \sum _z \left| {f^{-1}(z)}\right| _{\mu } \cdot | F_z|\).

New lower bound techniques for external \(\mathrm {IC}\) and \(\mathrm {PRIV}\): We prove a new lower bound on the external zero-error information cost which, using the equivalence between \(\mathrm {IC}\) and \(\mathrm {PRIV}\) given in Theorem 12, will in turn give new lower bounds on \(\mathrm {PRIV}_{\mu }^{ext}(f)\).

Theorem 6

Fix a function \(f\). Suppose there exists \(\delta > 0\) and a distribution \(\mu \) over the inputs of \(f\), such that for all monochromatic rectangles \(R\) of \(f\), \(\mu (R) \le \delta \). Then it holds for every protocol \(P\) that computes \(f\) without error on any input that \(\mathrm {IC}^{ext}_{\mu }(P) \ge \log (1/\delta )\).

We remark that our theorem allows us to prove exact bounds for zero-error \(\mathrm {IC}\) up to an additive constant term (with a small constant, between 1 and 2).

Theorem 7

For each of \(f = \mathrm {EQ}, \mathrm {GT}, \mathrm {DISJ}\), there exists \(\mu \) such that \(\mathrm {IC}^{ext}_{\mu }(f) \ge n\). Also, there exists \(\mu \) such that \(\mathrm {IC}^{ext}_{\mu }(\mathrm {IP}) \ge n - 1 - o(1)\).

These are much sharper than typical lower bounds on \(\mathrm {IC}\), which work in the bounded-error case and incur multiplicative constants [6, 7, 10, 16]. The only other such sharp lower bounds we are aware of are due to Braverman et al. [4] who study the AND and \(\mathrm {DISJ}\) functions. However they prove sharp bounds for the internal \(\mathrm {IC}\) of \(\mathrm {DISJ}\), not for the external \(\mathrm {IC}\) as we study here.

Our technique yields an optimal lower bound on the zero-error information cost of certain functions (i.e. without any additive constant loss). For the one-bit \(\mathrm {AND}\), we show that there exists \(\mu \) with \(\mathrm {IC}^{ext}_{\mu }(\mathrm {AND}) \ge \log _2 3\). This matches the bound of [4] (they also proved optimality via different techniques).

Privacy for bounded-error randomized protocols: We define for the first time \(\mathrm {PAR}\) and \(\mathrm {PRIV}\) for bounded-error protocols. Such protocols can be much more efficient than deterministic ones and it is important to see whether they remain private or not. These definitions capture again how much more information is leaked by the protocol than by the output of the function, where now we consider randomized protocols that compute the function with some bounded error. We show that for any protocol, \(\mathrm {PRIV}\) is a lower bound on \(\mathrm {PAR}\), both for the external and internal notions.

Theorem 8

\(\forall \mu , f, \epsilon , \; \mathrm {PRIV}_{\mu ,\epsilon }^{ext}(f) \le \log \mathrm {PAR}^{ext}_{\mu ,\epsilon }(f)\) and \(\mathrm {PRIV}_{\mu ,\epsilon }^{int}(f) \le 2\log \mathrm {PAR}^{int}_{\mu ,\epsilon }(f)-2\).

Internal \(\mathrm {PRIV}\) is lower bounded by internal \(\mathrm {IC}\), which was shown in [16] to subsume almost all known lower bounds for communication complexity, i.e. smooth rectangle, \(\gamma _2\)-norm, discrepancy, etc. Hence,

Corollary 1

(Informal) In the bounded error setting, for all boolean \(f\) whose internal information complexity equals communication complexity, all notions of privacy loss (PRIV, PAR, external, internal) are equivalent to each other and to the communication complexity.

Interestingly, \(\mathrm {PAR}\) sits between information and communication complexity, and it is an important open question whether these two notions are equal for all functions (and hence make \(\mathrm {PAR}\) equal to them).

Applications: We exhibit the power of these new lower bound techniques for \(\mathrm {PAR}\) and \(\mathrm {PRIV}\) by proving optimal lower bounds on most of the examples of functions left open in [13] and more: Equality, Disjointness, Inner Product, Greater Than (Millionaire’s problem).

Table 1. Lower bounds for specific functions, zero error.
Table 2. Lower bounds for specific functions, with bounded error.

Comparison between the two notions of privacy: For bounded-error protocols, the two notions of privacy seem to be practically equal for most functions. For the zero-error case, however, they can diverge for certain functions. In order to understand the differences between the notions, we study their robustness when we slightly change the input distribution, and we show that the information-theoretic notion of privacy is more robust to such changes. Moreover, we show that while \(\mathrm {PRIV}\) is always less than the expected communication complexity of the protocol, the same is not true for \(\mathrm {PAR}\). We also discuss an error in the appendix of [13], where it is claimed that \(\mathrm {PRIV}\) is not as robust as \(\mathrm {PAR}\).

2 Preliminaries

We consider three non-empty sets \({\mathcal {X}}, {\mathcal {Y}}, {\mathcal {Z}}\) and a function \({f}: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\). \({\mu }\) denotes a distribution over \(\mathcal {X}\times \mathcal {Y}\), and for any set \(E \subseteq \mathcal {X}\times \mathcal {Y}\), \({\left| {E}\right| _{\mu }} := \sum _{(x,y) \in E} \mu (x,y)\). \({\mathcal {M}_{f}}\) is the matrix of \(f\): \(\mathcal {M}_{f}[x,y] := f(x,y)\). A rectangle of \(\mathcal {X}\times \mathcal {Y}\) is a product set \(A \times B\) where \(A \subseteq \mathcal {X}\) and \(B \subseteq \mathcal {Y}\).

We let \(P\) denote a two-party communication protocol. Protocols may use both public and private random coins. We let \(r\) denote the ensemble of all random coins (public and private) a protocol may use; we let \(R\) denote a random variable of all these coins, and \(R_{\mathsf {pub}}\) denote just the public coins. Given a (possibly randomized) protocol \(P\), for any input \((x,y) \in \mathcal {X}\times \mathcal {Y}\) and random coins \(r\), \({P(x,y,r)}\) is the value output by Alice and Bob upon running the protocol, and \({T_P(x,y,r)}\) is the transcript, comprising all messages and public coins. We omit \(r\) in the previous if \(P\) is deterministic. Let \(\mathbf {CC}(P)\) be the maximum number of bits communicated by \(P\) over all choices of inputs and random coins. Let \(\mathbf {D}(f) = \min _P \mathbf {CC}(P)\) where \(P\) ranges over all deterministic protocols computing \(f\). Let \(\mathbf {R}^{\epsilon }(f) = \min _P \mathbf {CC}(P)\) where \(P\) ranges over all randomized protocols computing \(f\) with error at most \(\epsilon \) on each input.

In the following paragraph we let \(P\) be a deterministic protocol that perfectly computes a function \(f\). For any input \((x,y) \in \mathcal {X}\times \mathcal {Y}\), the monochromatic \(f\)-region of \((x,y)\) is defined as \({\mathrm {D}^{f}_{x,y}} := f^{-1}(f(x,y))\), and is equal to the monochromatic \(P\)-region \({\mathrm {D}^{P}_{x,y}}\) of \((x,y)\). The monochromatic \(P\)-rectangle of \((x,y)\) is defined as \({\mathrm {D}^{T_P}_{x,y}} := T_P^{-1}(T_P(x,y))\) (the fact that this is a rectangle and not an arbitrary subset is a well-known consequence of \(P\) being a communication protocol). For any output \(z \in \mathcal {Z}\), the monochromatic \(f\)-region of \(z\) is: \({f^{-1}(z)} := f^{-1}(\{z\})\), which is equal to the monochromatic \(P\)-region of \(z\), \({P^{-1}(z)}\). Let \(\mathcal {R}^{P}_{z}\) be the set of \(P\)-rectangles covering \(P^{-1}(z)\), that is: \({\mathcal {R}^{P}_{z}} := \{\mathrm {D}^{T_P}_{x,y}| (x,y):\, P(x,y)=z\}\). Let \({\mathcal {R}^{P}} = \cup _{z\in \mathcal {Z}} \mathcal {R}^{P}_{z} = \left\{ \mathrm {D}^{T_P}_{x,y} \big | (x,y)\in \mathcal {X}\times \mathcal {Y}\right\} \) be the set of all \(P\)-rectangles. For each \(z\in \mathcal {Z}\), \({\mathrm {cut}_{P}(f^{-1}(z))}\) is the number of \(P\)-rectangles in \(f^{-1}(z)\); \(\mathcal {R}(\mathcal {X}\times \mathcal {Y})\) is the set of all rectangles in \(\mathcal {X}\times \mathcal {Y}\).

For three random variables \(A,B,C\) the conditional mutual information is defined as \({\mathbf {I}(A;B|C)}\) \(:= \mathbf {H}(A|C) - \mathbf {H}(A|BC)\), where \(\mathbf {H}\) denotes Shannon entropy: if \(X\) and \(Y\) are two random variables \(\mathbf {H}(X) = \sum _x \mathbb {P}\{X=x\} \log ( 1/ \mathbb {P}\{X=x\})\) and \(\mathbf {H}(X|Y) = \mathbb {E}[-\log (\mathbb {P}(X|Y))]\). We recall some simple facts about information and entropy (more details about information theory can be found in the textbook of Cover and Thomas [11].) For any random variables \(X,Y,Z,W\), the Chain Rule says that \( \mathbf {H}(X,Y) = \mathbf {H}(X) + \mathbf {H}(Y|X)\) and \( \mathbf {I}(X, Z;Y) = \mathbf {I}(X;Y) + \mathbf {I}(Z;Y|X)\). Another easy fact (see for example [1]) is that:

$$\begin{aligned} |\mathbf {I}(X;Y|W) - \mathbf {I}(X;Y|W,Z)| \le \mathbf {H}(Z) \end{aligned}$$
(1)

We let \(\mathsf {D}_{\mathsf {KL}}\) denote the KL-divergence, \(\mathsf {D}_{\mathsf {KL}}(X \parallel Y) = \mathbb {E}_{x \sim X} \log \frac{\Pr [X = x]}{\Pr [Y = x]}\). It is easy to see that \(\mathbf {I}(X; Y) = \mathsf {D}_{\mathsf {KL}}(XY \parallel X' Y)\) where \(X'\) is an independent copy of \(X\). We will also use the following data processing inequality for KL-divergence (we include a proof in the appendix for the sake of completeness):

Lemma 1

For any \(X, Y\) and any deterministic function \(L\), the following holds:

$$\begin{aligned} \mathsf {D}_{\mathsf {KL}}(X \parallel Y) \ge \mathsf {D}_{\mathsf {KL}}(L(X) \parallel L(Y)) \end{aligned}$$
(2)
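As a quick illustration (our own, not from the paper), the following Python snippet checks Lemma 1 numerically on a small explicit example: applying a deterministic map \(L\) can only decrease the KL-divergence. The distributions and the map \(L\) are illustrative assumptions.

```python
# Numeric sanity check of the data processing inequality (Lemma 1).
import math

def kl(p, q):
    """D_KL(p || q) in bits, with the convention 0 * log 0 = 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

def push_forward(p, L):
    """Distribution of L(X) when X ~ p, for a deterministic map L."""
    out = {}
    for x, px in p.items():
        out[L(x)] = out.get(L(x), 0.0) + px
    return out

p = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}
q = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
L = lambda x: x % 2                                   # a deterministic post-processing

print(kl(p, q))                                       # D_KL(X || Y), about 0.154
print(kl(push_forward(p, L), push_forward(q, L)))     # D_KL(L(X) || L(Y)), about 0.029, smaller
```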

2.1 Definitions of Privacy

In the following, \((X,Y)\) denotes a pair of random variables, distributed according to \(\mu \), and \(P\) denotes a (possibly randomized) protocol.

Information Cost: We define the external and internal information cost, notions that have recently found many applications in communication complexity [2, 6, 7, 12]. The external information cost measures the amount of information that is learned from someone who looks at the messages exchanged between Alice and Bob during the protocol about their inputs. The internal information cost measures the amount of information that Alice learns about Bob’s input and vice versa.

Definition 1

The external information cost of \(P\) is defined as \({\mathrm {IC}^{ext}_{\mu }(P)} := \mathbf {I}(X,Y;T_P(X,Y,R))\). The external information cost of \(f\) is \({\mathrm {IC}^{ext}_{\mu ,\epsilon }(f)} := \inf _P \mathrm {IC}^{ext}_{\mu }(P)\), where the infimum is over all protocols \(P\) computing \(f\) with distributional error at most \(\epsilon \).

Definition 2

We define the internal information cost of \(P\) as \({\mathrm {IC}^{int}_{\mu }(P)} := \mathbf {I}(X;T_P(X,Y,R)|Y) + \mathbf {I}(Y;T_P(X,Y,R)|X)\). The internal information cost of \(f\) is \({\mathrm {IC}^{int}_{\mu ,\epsilon }(f)} := \inf _P \mathrm {IC}^{int}_{\mu }(P)\), where the infimum is over all protocols \(P\) computing \(f\) with distributional error at most \(\epsilon \).
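To make Definitions 1 and 2 concrete, here is a small numeric sketch (ours, not from the paper) for the trivial one-bit protocol whose transcript is \(T = (x,y)\), under the uniform distribution; for it, \(\mathrm {IC}^{ext}_{\mu }(P) = \mathbf {H}(X,Y) = 2\) and \(\mathrm {IC}^{int}_{\mu }(P) = \mathbf {H}(X|Y) + \mathbf {H}(Y|X) = 2\).

```python
# Computing IC^ext and IC^int of the trivial one-bit protocol T = (x, y)
# under the uniform distribution. Illustrative sketch, not from the paper.
import math
from itertools import product

mu = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}
transcript = lambda x, y: (x, y)               # trivial protocol: both inputs are sent

def mutual_information(joint):
    """I(A;B) in bits from a joint distribution {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b])) for (a, b), p in joint.items() if p > 0)

# IC^ext_mu(P) = I(X,Y ; T)
joint_ext = {((x, y), transcript(x, y)): p for (x, y), p in mu.items()}
print(mutual_information(joint_ext))           # 2.0 bits

# IC^int_mu(P) = I(X;T|Y) + I(Y;T|X), computed by conditioning on the fixed coordinate.
def conditional_mi(fixed_coord):
    total = 0.0
    for v in (0, 1):
        slice_ = {xy: p for xy, p in mu.items() if xy[fixed_coord] == v}
        w = sum(slice_.values())
        joint = {(xy[1 - fixed_coord], transcript(*xy)): p / w for xy, p in slice_.items()}
        total += w * mutual_information(joint)
    return total

print(conditional_mi(1) + conditional_mi(0))   # I(X;T|Y) + I(Y;T|X) = 2.0 bits
```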

Information-theoretic privacy: In [9], the definition of privacy (\(\mathcal {I}_{\mathrm {c-i}}^{\mathrm {det}}\) in their notation) is essentially the same as what we now call \(\mathrm {IC}^{int}_{\mu }(P)\) (they used the max instead of the sum of the two terms). A related notion of privacy was defined by Klauck in [15]. We give a distribution-dependent version of his definition. At a high level, it quantifies how much more an observer learns about the inputs from the transcript than from the value of the function. We also define an internal version of the definition. We assume that the output of a randomized protocol depends only on the transcript (i.e. \(P(x,y,r)\) is a deterministic function of \(T_P(x,y,r)\)).

Definition 3

The external privacy of \(P\) is defined as \({\mathrm {PRIV}_{\mu }^{ext}(f,P)} := \mathbf {I}(X,Y;T_{P}(X,Y,R)) - \mathbf {I}(X,Y;f(X,Y))\). For \(\epsilon \ge 0\), the external \(\epsilon \)-error privacy of \(f\) is defined as the following, where the infimum is taken over all protocols \(P\) computing \(f\) with distributional error at most \(\epsilon \): \( {\mathrm {PRIV}_{\mu ,\epsilon }^{ext}(f)} := \inf _P \mathrm {PRIV}_{\mu }^{ext}(f,P) \). We let \({\mathrm {PRIV}_{\mu }^{ext}(f)} := {\mathrm {PRIV}_{\mu ,0}^{ext}(f)}\).

Definition 4

The internal privacy of \(P\) is defined as \({\mathrm {PRIV}_{\mu }^{int}(f,P)} := \mathbf {I}(X;T_{P}(X,Y,R)|Y) - \mathbf {I}(X;f(X,Y)|Y) + \mathbf {I}(Y;T_{P}(X,Y,R)|X) - \mathbf {I}(Y;f(X,Y)|X).\) For \(\epsilon \ge 0\), the internal \(\epsilon \)-error privacy of \(f\) is defined as the following, where the infimum is taken over all protocols \(P\) computing \(f\) with distributional error at most \(\epsilon \): \( {\mathrm {PRIV}_{\mu ,\epsilon }^{int}(f)} := \inf _P \mathrm {PRIV}_{\mu }^{int}(f,P) \). We let \({\mathrm {PRIV}_{\mu }^{int}(f)} := {\mathrm {PRIV}_{\mu ,0}^{int}(f)}\).

It is easy to see that our definition is equivalent to the one in [15] for deterministic or zero-error protocols.

Combinatorial privacy \(\mathrm {PAR}\): We present here the definition of \(\mathrm {PAR}\) for deterministic protocols given by [1], which modified the original definition in [13] in order to measure the size of regions relative to the inputs’ distribution.

Definition 5

The external privacy-approximation ratio of a deterministic protocol \(P\) for \(f\) is defined as: \( {\mathrm {PAR}_{\mu }^{ext}(f,P)} := \mathbb {E}_{(x,y)\sim \mu } \left[ \tfrac{\left| {\mathrm {D}^{f}_{x,y}}\right| _{\mu }}{\left| {\mathrm {D}^{T_P}_{x,y}}\right| _{\mu }} \right] = \mathbb {E}_{(x,y)\sim \mu } \left[ \tfrac{\left| {\mathrm {D}^{P}_{x,y}}\right| _{\mu }}{\left| {\mathrm {D}^{T_P}_{x,y}}\right| _{\mu }} \right] \) (where the equality holds because \(P\) has zero error). The external privacy-approximation ratio of a function \(f\) is defined as: \( {\mathrm {PAR}_{\mu }^{ext}(f)} := \displaystyle \inf _{P} \mathrm {PAR}_{\mu }^{ext}(f,P)\) where the infimum is over all deterministic \(P\) computing \(f\) with zero error.

Definition 6

The internal privacy-approximation ratio of a deterministic protocol \(P\) for \(f\) is defined as: \( {\mathrm {PAR}_{\mu }^{int}(f, P)}:= \mathbb {E}_{(x,y)\sim \mu } \left[ \tfrac{\left| {\mathrm {D}^{f}_{x,y} \cap \mathcal {X}\times \{y\}}\right| _{\mu }}{\left| {\mathrm {D}^{T_P}_{x,y}\cap \mathcal {X}\times \{y\}}\right| _{\mu }} \right] + \mathbb {E}_{(x,y)\sim \mu } \left[ \tfrac{\left| {\mathrm {D}^{f}_{x,y}\cap \{x\} \times \mathcal {Y}}\right| _{\mu }}{\left| {\mathrm {D}^{T_P}_{x,y}\cap \{x\} \times \mathcal {Y}}\right| _{\mu }} \right] \). The internal privacy-approximation ratio of a function \(f\) is defined as: \( {\mathrm {PAR}_{\mu }^{int}(f)} := \displaystyle \inf _{P} \mathrm {PAR}_{\mu }^{int}(f, P)\) where the infimum is over all deterministic \(P\) computing \(f\) with zero error.

The external \(\mathrm {PAR}\) equals a weighted sum of the number of rectangles tiling each \(f\)-monochromatic region.

Theorem 9

([1]) For any deterministic protocol \(P\), we have: \(\mathrm {PAR}_{\mu }^{ext}(f,P) = \sum _{z\in \mathcal {Z}} \left| {f^{-1}(z)}\right| _{\mu } \cdot \mathrm {cut}_{P}(f^{-1}(z)).\)

This result was stated in [1] but for completeness we present a proof in the appendix.
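As an illustration of Theorem 9 (our own sketch, not from the paper), the following Python snippet checks the identity numerically for the one-bit \(\mathrm {AND}\) under the uniform distribution, assuming the trivial protocol in which both players send their bit, so that every input is its own \(P\)-rectangle.

```python
# Checking Theorem 9 numerically for one-bit AND, uniform mu, trivial protocol.
# Illustrative sketch; the protocol and distribution are our own assumptions.
from itertools import product

mu = {p: 0.25 for p in product((0, 1), repeat=2)}
f = lambda x, y: x & y
rectangle_of = {p: frozenset({p}) for p in mu}       # trivial protocol: singleton rectangles

def mass(s):
    return sum(mu[p] for p in s)

# Definition 5: E_{(x,y)~mu} [ |D^f_{x,y}|_mu / |D^{T_P}_{x,y}|_mu ]
par_def = sum(mu[p] * mass({q for q in mu if f(*q) == f(*p)}) / mass(rectangle_of[p])
              for p in mu)

# Theorem 9: sum_z |f^{-1}(z)|_mu * cut_P(f^{-1}(z))
par_thm = 0.0
for z in {f(*p) for p in mu}:
    region = {p for p in mu if f(*p) == z}
    cut = len({rectangle_of[p] for p in region})     # number of P-rectangles in the region
    par_thm += mass(region) * cut

print(par_def, par_thm)                              # both equal 0.25*1 + 0.75*3 = 2.5
```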

We now extend the definition to randomized protocols. In the following, the expectations are taken over inputs \(x, y\) and random coins \(r\). A simple calculation shows that the following definition coincides with the definition of [1, 13] in the case of deterministic zero-error protocols.

Definition 7

We define:

  • The external \(\mathrm {PAR}\) of a randomized protocol \(P\) as:

    $$\begin{aligned} {\mathrm {PAR}^{ext}_{\mu }(f, P)} := \mathbb {E}_{x,y,r} \left[ \tfrac{\mathbb {P}_{X,Y,R}((X,Y) = (x,y) \, | \, T_P(X,Y,R) = T_P(x,y,r) )}{\mathbb {P}_{X,Y}((X,Y) = (x,y) \,|\, f(X,Y) = f(x,y))} \right] . \end{aligned}$$

    For \(\epsilon \ge 0\), the external \(\epsilon \)-error \(\mathrm {PAR}\) of \(f\) is defined as the following, where the infimum is taken over all protocols \(P\) computing \(f\) with error at most \(\epsilon \): \( {\mathrm {PAR}^{ext}_{\mu ,\epsilon }(f)} := \inf _P \mathrm {PAR}^{ext}_{\mu }(f, P) \).

  • The internal \(\mathrm {PAR}\) of a randomized protocol \(P\) as:

    $$\begin{aligned} {\mathrm {PAR}^{int}_{\mu }(f, P)} :=&\mathbb {E}_{x,y,r} \left[ \tfrac{\mathbb {P}_{X,Y,R}(Y = y \, | \, T_P(X,Y,R) = T_P(x,y,r) \wedge X=x )}{\mathbb {P}_{X,Y}(Y = y \,|\, f(X,Y) = f(x,y) \wedge X=x)} \right] \\&+ \mathbb {E}_{x,y,r} \left[ \tfrac{\mathbb {P}_{X,Y,R}(X = x \, | \, T_P(X,Y,R) = T_P(x,y,r) \wedge Y=y )}{\mathbb {P}_{X,Y}(X = x \,|\, f(X,Y) = f(x,y) \wedge Y=y)} \right] . \end{aligned}$$

    For \(\epsilon \ge 0\), the internal \(\epsilon \)-error \(\mathrm {PAR}\) of \(f\) is defined as the following, where the infimum is taken over all protocols \(P\) computing \(f\) with error at most \(\epsilon \): \( {\mathrm {PAR}^{int}_{\mu ,\epsilon }(f)} := \inf _P \mathrm {PAR}^{int}_{\mu }(f,P) \).

Remark 1

There is another way to generalize the definition of PAR for 0-error protocols. This alternative definition is deferred to the appendix.

3 Relations Between Privacy Notions and Communication

We prove a number of relations between the different notions of privacy, communication complexity and information cost both for deterministic and randomized protocols. We summarize them in Fig. 1. In the diagram, an arrow \(A \leftarrow B\) indicates that \(A \le B\) (up to constants). The quantities indicate worst-case complexity except for Dist (see Theorem 13). Relations between:

  • \(\mathrm {PAR}\) and \(\mathrm {PRIV}\) are given in Theorem 8 (which was proved in [1] only for the deterministic 0-error internal case);

  • \(\mathbf {D}\) (resp. \(\mathbf {R}^{\epsilon }\)) and \(\mathrm {PAR}\) is given by Theorem 11;

  • \(\mathrm {IC}\) and \(\mathrm {PRIV}\) are given in Theorem 12 (which was proved in [1] only for the deterministic 0-error internal case);

  • The expected distributional complexity and \(\mathrm {IC}\) (or \(\mathrm {PRIV}\)) for every possible input distribution is given in Theorem 13;

  • \(\mathrm {PRIV}^{\mathrm {ext}}\) and \(\mathrm {PRIV}^\mathrm {int}\) is given in Theorem 14;

  • \(\mathrm {PAR}^{\mathrm {ext}}\) and \(\mathrm {PAR}^{\mathrm {int}}\) comes from Theorem 15 (for the deterministic case).

We start by proving that PRIV provides a lower bound for the \(\log \) of PAR:

Fig. 1. Lower bounds diagrams for the deterministic and bounded-error cases

3.1 Relations Between the Different Notions of Privacy and Communication Complexity

We provide below the proof of Theorem 8. The other theorems are proven in the appendix.

Theorem 8

[restated] For any input distribution \(\mu \) and any (deterministic or randomized) protocol \(P\), it holds that \(\mathrm {PRIV}_{\mu }^{ext}(f, P) \le \log \left( \mathrm {PAR}^{ext}_\mu (f, P) \right) \) and \(\mathrm {PRIV}_{\mu }^{int}(f, P) \le 2\cdot \log \left( \mathrm {PAR}^{int}_\mu (f, P) \right) -2\). As a consequence, \(\forall \mu , f, \epsilon \) it holds that \(\mathrm {PRIV}_{\mu ,\epsilon }^{ext}(f) \le \log \left( \mathrm {PAR}^{ext}_{\mu ,\epsilon }(f) \right) \) and \(\mathrm {PRIV}_{\mu ,\epsilon }^{int}(f) \le 2 \cdot \log \left( \mathrm {PAR}^{int}_{\mu ,\epsilon }(f) \right) -2\).

Proof

For external privacy, this is a consequence of Bayes rule, and for internal privacy, this is a consequence of Bayes rule and an argument about the worse of the two terms comprising internal PRIV and PAR. The details of this proof will appear in the full version of the article ([17]).

Remark 2

Note that when the protocol is externally (resp. internally) perfectly private, the inequality is tight since \(\mathrm {PRIV}_{\mu }^{ext}(f, P)=0\) and \(\mathrm {PAR}^{ext}_\mu (f, P)=1\) (resp. \(\mathrm {PRIV}_{\mu }^{int}(f,P)=0\) and \(\mathrm {PAR}^{int}_\mu (f,P)=2\)).

3.2 Applications: Tight Bounds on PAR and PRIV for Specific Functions

For boolean functions, \(\mathrm {PRIV}\) is essentially lower bounded by information cost, which subsumes almost all known lower bounds for communication complexity, i.e. smooth rectangle, \(\gamma _2\)-norm, discrepancy, etc. [16]. Hence, Theorem 8 implies

Corollary 1

[restated] For all \(f, \mu , \epsilon \) such that \(\mathbf {R}^{\epsilon }(f) = O(\mathrm {IC}^{int}_{\mu ,\epsilon }(f))\), it holds that \(\log \mathrm {PAR}_{\mu ,\epsilon }(f) = \mathrm {PRIV}_{\mu ,\epsilon }(f) = \mathbf {R}^{\epsilon }(f)\) up to constant factors (for both internal and external notions).

Interestingly, the notion of \(\mathrm {PAR}\) sits between information and communication complexity, and it is an important open question whether these two notions are equal (which would also make \(\mathrm {PAR}\) equal to them). For the bounds in Table 2, the results follow immediately from known lower bounds on the \(\mathrm {IC}\) of these functions: for \(\mathrm {EQ}\) the lower bound is trivial, for \(\mathrm {DISJ}\) one can look at \(\mathrm {IC}\) directly [6, 7], while for \(\mathrm {EQ}, \mathrm {IP}, \mathrm {GT}\) one can look at their discrepancies [10]. Then, using Theorems 12 and 8 we obtain bounds on internal \(\mathrm {PAR}\). Note that the bounds also hold for external \(\mathrm {PRIV}\) and \(\mathrm {PAR}\) (since internal is always at most external, see Theorem 14). Moreover, we can also get similar lower bounds for the functions Vector in Subspace and Gap-Hamming Distance by the results in [16].

4 New Lower Bound Techniques for \(\mathrm {PAR}\) and \(\mathrm {PRIV}\) of Deterministic Protocols

In Subsects. 4.1 and 4.2, we assume \(\mu \) to be full-support for simplicity. By restricting the summations to the rectangles that intersect the support of \(\mu \), it is possible to get similar results for a general distribution.

4.1 External \(\mathrm {PAR}\) of Boolean Functions

Let \(f\) be a boolean function, \(P\) a deterministic protocol for \(f\) and \(T\) its transcript. Let \(n_0\) and \(n_1\) be the number of \(P\)-rectangles with output 0 and 1 (\(n_0 = |\mathcal {R}^{P}_{0}|, n_1 = | \mathcal {R}^{P}_{1}|\)). We lower bound \(\mathrm {PAR}\) by the communication matrix rank.

Theorem 1

[restated] For boolean \(f\), for any distribution \(\mu \) with full support,

$$\begin{aligned} \mathrm {PAR}_{\mu }^{ext}(f) \ge \min \{\mathrm {rank}\left( \mathcal {M}_{f}\right) , \mathrm {rank}\left( \mathcal {M}_{\lnot f}\right) \} \ge \mathrm {rank}\left( \mathcal {M}_{f}\right) - 1. \end{aligned}$$

The proof will appear in the full version of the article ([17]) and uses Theorem 9 in [1]. We will also use the following result of Yannakakis, which, restated in our notation, says:

Lemma 2

(Lemma 1 in [21]) For boolean \(f\) and any deterministic protocol \(P\), \(\log \min (n_0,n_1) \ge \sqrt{\mathbf {D}(f)}\).

In fact, Yannakakis proves only that \(\log n_1 \ge \sqrt{\mathbf {D}(f)}\), but it is easy to verify that the proof does not depend on the output value of the monochromatic rectangles, so the same bound holds for \(n_0\). Using in addition the fact that \( \mathrm {PAR}_{\mu }^{ext}(f) \ge \min (n_0,n_1) \), we have

Theorem 2

[restated] For boolean \(f\), for any distribution \(\mu \) with full support, \( \log \mathrm {PAR}_{\mu }^{ext}(f) \ge \sqrt{\mathbf {D}(f)}.\)

Note that Theorem 1 is not true in general for non-boolean functions (see Appendix).
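For instance (a hypothetical illustration of our own, not taken from the paper), the communication matrix of \(\mathrm {EQ}\) on \(n\)-bit strings is the \(2^n \times 2^n\) identity matrix, so Theorem 1 gives \(\mathrm {PAR}_{\mu }^{ext}(\mathrm {EQ}) \ge 2^n - 1\) for any full-support \(\mu \):

```python
# Rank of the communication matrix of EQ on n-bit strings (illustrative check).
import numpy as np

n = 3
N = 2 ** n
M_eq = np.array([[1 if x == y else 0 for y in range(N)] for x in range(N)])
print(np.linalg.matrix_rank(M_eq))   # 8 = 2^n, so Theorem 1 gives PAR^ext >= 2^n - 1
```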

4.2 External \(\mathrm {PAR}\) for Non-boolean Functions

Definition 8

Let \(\widetilde{{\mathrm {PAR}}}_{\mu }(f)\) be the value of the following linear program:

$$\begin{aligned} \min _{w_{z,R}} \sum _{z,R} w_{z,R} \cdot \left| {f^{-1}(z)}\right| _{\mu } \quad s.t. \quad&\forall \, (x,y) \in f^{-1}(\mathcal {Z}): \sum _{R: R \ni (x,y)} w_{f(x,y),R} = 1\end{aligned}$$
(3)
$$\begin{aligned}&\forall \, (x,y) \in f^{-1}(\mathcal {Z}): \; \sum _{R: R \ni (x,y)} \sum _z w_{z,R} = 1\end{aligned}$$
(4)
$$\begin{aligned}&\forall \, z, \ \forall \, R: \quad w_{z,R} \ge 0. \end{aligned}$$
(5)

where the \(z\)’s and the \(R\)’s are always taken respectively in \(\mathcal {Z}\) and in \(\mathcal {R}(\mathcal {X}\times \mathcal {Y})\).

Intuitively, conditions (4) and (5) let us interpret the weights \(w_{z,R}\) as a probability distribution: \(w_{z,R}\) is the probability of picking the rectangle \(R\) and outputting \(z\). Condition (3) then forces the probability of outputting \(f(x,y)\) on input \((x,y)\) to be 1.
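The linear program is small enough to solve directly for toy functions. The following sketch (ours, not from the paper) computes \(\widetilde{{\mathrm {PAR}}}_{\mu }(f)\) for the one-bit \(\mathrm {AND}\) under the uniform distribution by enumerating all rectangles of \(\mathcal {X}\times \mathcal {Y}\) and calling an off-the-shelf LP solver.

```python
# Solving the LP of Definition 8 for one-bit AND under the uniform distribution.
# Illustrative sketch; the function, distribution and solver choice are ours.
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

X, Y = [0, 1], [0, 1]
f = lambda x, y: x & y
mu = {(x, y): 0.25 for x in X for y in Y}

def nonempty_subsets(s):
    return [set(c) for r in range(1, len(s) + 1) for c in combinations(s, r)]

rects = [(a, b) for a in nonempty_subsets(X) for b in nonempty_subsets(Y)]
Z = sorted({f(x, y) for x in X for y in Y})
inputs = [(x, y) for x in X for y in Y]

# mu-mass of each monochromatic region f^{-1}(z)
region_mass = {z: sum(mu[p] for p in inputs if f(*p) == z) for z in Z}

variables = [(z, i) for z in Z for i in range(len(rects))]   # the w_{z,R}
c = np.array([region_mass[z] for z, _ in variables])         # objective coefficients

A_eq, b_eq = [], []
for (x, y) in inputs:
    in_R = [i for i, (a, b) in enumerate(rects) if x in a and y in b]
    # constraint (3): sum_{R containing (x,y)} w_{f(x,y),R} = 1
    row3 = [1.0 if (z == f(x, y) and i in in_R) else 0.0 for z, i in variables]
    # constraint (4): sum_{R containing (x,y)} sum_z w_{z,R} = 1
    row4 = [1.0 if i in in_R else 0.0 for z, i in variables]
    A_eq += [row3, row4]
    b_eq += [1.0, 1.0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(variables),
              method="highs")
print(res.fun)   # 1.75 = 1/4 * rec^1(AND) + 3/4 * rec^0(AND), cf. Theorem 4
```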

Theorem 3

[restated] For all \(f\), for any distribution \(\mu \) with full support, \( {\mathrm {PAR}_{\mu }^{ext}(f) \ge \widetilde{{\mathrm {PAR}}}_{\mu }(f).} \)

Proof

Let \(P\) be a deterministic protocol for \(f\) and \(T\) its transcript. We can show that \(w_{z,R} := \mathbf {1}_{R\in \mathcal {R}^{P}_{z}}\) satisfies the conditions of Definition 8 and deduce the lower bound. The details of this proof will appear in the full version of the article ([17]).

Relation with rectangle linear program: We relate this linear program to the rectangle bound defined in [14]. For uniform output distribution, we can generalize this relation to the partition bound (see Appendix).

Definition 9

\(\mathrm {rec}^{z}(f)\) is the optimal value of the following linear program, where \(R\) is taken in \(\mathcal {R}(\mathcal {X}\times \mathcal {Y})\):

$$\begin{aligned} \min _{w_R} \sum _R w_R \quad \text {s.t. } \quad&\forall (x,y) \in f^{-1}(z): \sum _{R: R \ni (x,y)} w_R = 1\end{aligned}$$
(6)
$$\begin{aligned}&\forall (x,y) \in \mathcal {X}\times \mathcal {Y}\, \backslash \, f^{-1}(z): \quad \sum _{R: R \ni (x,y)} w_R = 0\end{aligned}$$
(7)
$$\begin{aligned}&\forall R : \quad w_R \ge 0. \end{aligned}$$
(8)

Theorem 4

[restated] For all f, for any distribution \(\mu \) with full support, \( \widetilde{{\mathrm {PAR}}}_{\mu }(f) \ge \sum _{z \in \mathcal {Z}} \left| {f^{-1}(z)}\right| _{\mu } \cdot \mathrm {rec}^{z}(f).\)

The proof will appear in the full version of the article ([17]).

Relation between \(\mathrm {PAR}\) and fooling sets: Recall that a \(z\)-fooling set (\(z \in \mathcal {Z}\)) for \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) is a subset \(F_z \subseteq f^{-1}(z)\) such that for all distinct \((x_1,y_1), (x_2,y_2) \in F_z\), it holds that \(f(x_1,y_2) \ne z\) or \(f(x_2,y_1) \ne z\). By Theorem 9 in [1] and the following theorem, we lower bound \(\mathrm {PAR}\) by fooling sets.

Theorem 10

([18]) If \(F_z\) is a \(z\)-fooling set for \(f\), then any covering of \(f^{-1}(z)\) by monochromatic rectangles has at least \(|F_z|\) rectangles.

Theorem 5

[restated] For all f and any set of z-fooling sets \(\{F_z\}_{z \in \mathcal {Z}}\), for any distribution \(\mu \) with full support, \( \mathrm {PAR}_{\mu }^{ext}(f) \ge \sum _{z \in \mathcal {Z}} \left| {f^{-1}(z)}\right| _{\mu } \cdot | F_z|.\)
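For example (an illustrative check of our own, not from the paper), the diagonal \(\{(x,x)\}\) is a 1-fooling set for \(\mathrm {EQ}\) of size \(2^n\), so Theorem 5 gives \(\mathrm {PAR}_{\mu }^{ext}(\mathrm {EQ}) \ge \left| {\mathrm {EQ}^{-1}(1)}\right| _{\mu } \cdot 2^n\):

```python
# Verifying that the diagonal is a 1-fooling set for EQ on n-bit strings.
# Illustrative sketch, not from the paper.
from itertools import combinations

n, N = 3, 2 ** 3
eq = lambda x, y: int(x == y)
F1 = [(x, x) for x in range(N)]                  # candidate 1-fooling set, size 2^n

def is_fooling_set(F, f, z):
    if any(f(x, y) != z for (x, y) in F):
        return False
    return all(f(x1, y2) != z or f(x2, y1) != z
               for (x1, y1), (x2, y2) in combinations(F, 2))

print(is_fooling_set(F1, eq, 1), len(F1))        # True 8
```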

4.3 New Lower Bound Techniques for External \(\mathrm {IC}\)

We show lower bounds on the external information complexity, which using Theorem 12 will in turn give new lower bounds on information-theoretic privacy. Our lower bounds hold for zero-error randomized protocols, which of course imply the same bounds for deterministic protocols.

Theorem 6

[restated] Fix a function \(f\). Suppose there exists \(\delta > 0\) and a distribution \(\mu \) over the inputs of \(f\), such that for all monochromatic rectangles \(R\) of \(f\), \(\mu (R) \le \delta \). Then it holds for every \(P\) that computes \(f\) without error on any input (i.e. even on pairs of inputs lying outside \(\mu \)’s support) that \(\mathrm {IC}^{ext}_{\mu }(P) \ge \log (1/\delta )\).

The proof will appear in the full version of the article ([17]).

Corollary 2

For any function \(f\) with a fooling set \(S\) of size \(|S| = k\), there exists a distribution \(\mu \) such that for all protocols \(P\) that compute \(f\) with zero error over \(\mu \), it holds that \(\mathrm {IC}^{ext}_{\mu }(P) \ge \log k\).

The proof of this corollary will appear in the full version of the article ([17]). Note that Theorem 6 can be used to prove an optimal lower bound on the zero-error information complexity of certain functions. For example, for one bit \(\mathrm{{AND}}\), the hard distribution \(\mu \) is uniform over (0, 1), (1, 0), (1, 1), and our theorem implies that \(\mathrm {IC}^{ext}_{\mu }(P) \ge \log _2 3\). This matches a recent exact bound (which is in particular an upper bound) by Braverman et al. [4].
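The following sketch (ours, not from the paper) verifies the hypothesis of Theorem 6 for this distribution: every \(f\)-monochromatic rectangle of the one-bit \(\mathrm {AND}\) has \(\mu \)-mass at most \(1/3\), so \(\delta = 1/3\) and the theorem gives \(\mathrm {IC}^{ext}_{\mu }(P) \ge \log _2 3\).

```python
# Max mu-mass of a monochromatic rectangle of one-bit AND, with mu uniform
# over {(0,1), (1,0), (1,1)}. Illustrative check of the hypothesis of Theorem 6.
from itertools import combinations

mu = {(0, 1): 1/3, (1, 0): 1/3, (1, 1): 1/3, (0, 0): 0.0}
f = lambda x, y: x & y

def nonempty_subsets(s):
    return [set(c) for r in range(1, len(s) + 1) for c in combinations(s, r)]

max_mass = 0.0
for A in nonempty_subsets([0, 1]):
    for B in nonempty_subsets([0, 1]):
        if len({f(x, y) for x in A for y in B}) == 1:          # monochromatic rectangle
            max_mass = max(max_mass, sum(mu[(x, y)] for x in A for y in B))
print(max_mass)                                                # 1/3, hence IC^ext >= log2(3)
```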

4.4 Applications: Tight Bounds on External PAR and PRIV for Specific Functions

Our applications in Table 1 follow from the lower bound techniques presented above, together with well-known facts about the rank or the size of fooling sets of the communication matrices of the functions in question.

The proofs will appear in the full version of the article ([17]).

5 Quality of the Two Definitions

5.1 Privacy for Deterministic Protocols

Deterministic protocols: for deterministic protocols, the two definitions of privacy, \(\mathrm {PRIV}\) and \(\mathrm {PAR}\), can be arbitrarily different for the same distribution. At a high level, \(\mathrm {PRIV}\) captures the expected privacy loss of a protocol, while \(\mathrm {PAR}\) captures a more “risk-averse” notion of privacy, where a protocol is penalized heavily for high-privacy-loss events, even if they occur with small probability.

In the appendix, we show that this difference makes \(\mathrm {PRIV}\) a much more robust definition: an \(\epsilon \) change in the input distribution causes at most an \(\epsilon n\) change in \(\mathrm {PRIV}\), so \(\mathrm {PRIV}\) is “smooth”. Furthermore, \(\mathrm {PRIV}\) always remains less than the expected communication of the protocol, which we believe to be another natural property. We prove that this is not the case for \(\mathrm {PAR}\): sometimes an \(\epsilon \) change in the input distribution can cause \(\mathrm {PAR}\) to change exponentially, and \(\mathrm {PAR}\) can grow arbitrarily larger than the expected communication. Finally, we also point out an error in the appendix of [13] and show that, for the example they gave, \(\mathrm {PRIV}\) is in fact just as good as \(\mathrm {PAR}\) at distinguishing the two protocols.

Bounded-error case: As we explained in Sect. 3.2, in the case of bounded-error randomized protocols, the two notions of privacy are in fact both equal to the communication complexity for all boolean functions for which we have a tight bound on their communication complexity. Moreover, for functions with large output, we still do not have any example where \(\mathrm {PRIV}\) and \(\mathrm {PAR}\) are different when we are allowed bounded error.