Practical Multi-party Private Set Intersection Cardinality and Intersection-Sum Under Arbitrary Collusion

Chen, You; Ding, Ning; Gu, Dawu; Bian, Yang

doi:10.1007/978-3-031-26553-2_9

You Chen⁹,
Ning Ding⁹,
Dawu Gu⁹ &
…
Yang Bian¹⁰

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13837))

Included in the following conference series:

International Conference on Information Security and Cryptology

1147 Accesses
1 Citations

The original version of this chapter was revised: this chapter contained errors on page 8, 9, 10 & 11 in chapter 9 which is indicated in our final book. The correction to this chapter is available at https://doi.org/10.1007/978-3-031-26553-2_27

Abstract

Private set intersection cardinality (PSI-CA) and private intersection-sum with cardinality (PSI-CA-sum) are two primitives that enable data owners to learn the intersection cardinality of their data set, with the difference that PSI-CA-sum additionally outputs the sum of the associated integer values of all the data that belongs to the intersection (i.e., intersection-sum). In this paper, we investigate the practical constructions of these two primitives, focusing on the multi-party setting. To our knowledge, all existing multi-party PSI-CA (MPSI-CA) protocols are either impractical or vulnerable to arbitrary collusion (i.e., the adversary can corrupt any proper subset of all parties), and as for multi-party PSI-CA-sum (MPSI-CA-sum), there is even no formalization for this notion at present, not to mention secure constructions for it.

So in this paper, we first propose the first MPSI-CA protocol that achieves simultaneous practicality and security against arbitrary collusion (in the semi-honest adversary model). We also conduct implementation to verify its practicality (while the previous results under arbitrary collusion only present theoretical analysis of performance, lacking real implementation). Numeric results show that it only takes 12.805 s to finish the online computation by shifting expensive operations to an offline phase, even in the dishonest majority setting with 15 parties each holding \(2^{16}\) data. Among all parties, the cost of clients is especially lower compared to that of the known results, which is only 0.3 s in finishing their tasks.

Second, we formalize the notion of MPSI-CA-sum and give the first realization which admits simultaneous practicality and security against arbitrary collusion as well. The computational complexity of it is roughly double that of our MPSI-CA protocol.

Besides the main results, we introduce the notions and provide efficient constructions of two new building blocks: multi-party secret-shared shuffle and oblivious zero-sum check, which may be of independent interest.

Access provided by Autonomous University of Puebla. Download conference paper PDF

Scalable Multi-party Private Set Union from Multi-query Secret-Shared Private Membership Test

A Scalable Multiparty Private Set Intersection

Efficient and Collusion Resistant Multi-party Private Set Intersection Protocols for Large Participants and Small Sets Setting

Keywords

1 Introduction

Motivation. Private set intersection cardinality (PSI-CA) is a cryptographic primitive that enables multiple parties to learn the intersection cardinality of their private data sets without leaking other information beyond the intersection cardinality. PSI-CA can be applied to real-world applications like measuring advertisement conversion rates [10] and so on. Despite its broad usage, nevertheless, PSI-CA is still not sufficient for some applications where each data is associated with an integer value (e.g. payload), like measuring advertisement conversion rates when one person contributes multiple purchases [10]. Thus a variant of PSI-CA is proposed, known as private intersection-sum with cardinality (PSI-CA-sum) [10], which is specified to output the intersection cardinality, as well as the sum of associated payloads for all the elements that belong to the intersection (i.e., intersection-sum).

Besides measuring advertisement conversion rates, we come up with the following possible application of PSI-CA-sum. Consider a score-based voting scenario with multiple voters, where voter \(P_{i}\) can vote for any candidate \(s\in \{0,1\}^{*}\) that he prefers, and the ballot of him is associated with a score for candidate s (s is the candidate’s ID). If \(P_{i}\) does not vote for candidate s, then there is no need for him to give s a score. \(P_{i}\)’s voting result is represented using a set \(S_{i}=\{(s_{i,1},v_{i}(s_{i,1}),...,(s_{i,m},v_{i}(s_{i,m})\}\) of size m, where \(s_{i,k},k\in [m]\) are the IDs of his chosen candidates and \(v_{i}(s_{i,k})\) is his score of candidate \(s_{i,k}\). Given the set \(S_{i},i\in [n]\) of n voters, the set of common candidates supported by all voters is denoted as set intersection IS. The total score of a common candidate s is \(\sum _{i=1}^{n} v_{i}(s)\), which can be used to calculate the average score of every common candidate. In this problem setting, the required information consists of the intersection cardinality |IS| and the sum of common candidates’ scores \(Sum_{IS}\) (i.e., \(Sum_{IS}=\sum _{i=1}^{n}\sum _{x \in IS}v_{i}(x)\)), so that the average score of a common candidate is \(Sum_{IS}/|IS|\). Here, PSI-CA-sum can be employed to securely obtain the average score without additional information leakage.

However, most existing PSI-CA protocols work in the two-party setting, while the results of multi-party PSI-CA (MPSI-CA) are either limited by massive computational overhead, or vulnerable to arbitrary collusion (i.e., the adversary can corrupt any proper subset of all parties [15]). Meanwhile, to the best of our knowledge, there has been no work for multi-party PSI-CA-sum (MPSI-CA-sum). Therefore, we will address the problems and aim at formalizing the notion of MPSI-CA-sum, proposing protocols for MPSI-CA and MPSI-CA-sum that can achieve simultaneous practicality and security against arbitrary collusion.

1.1 State of the Art of MPSI-CA

Although there have been some effective two-party PSI-CA schemes [5, 7, 13], only a small number of works can deal with the multi-party setting [1, 2, 11, 17].

Existing constructions of PSI-CA protocols can be generally classified into three categories, depending on whether the protocol is based on circuits, public key operations, or oblivious transfer (OT) and its extensions, say oblivious programmable pseudorandom function (OPPRF). Previous MPSI-CA schemes secure against arbitrary collusion typically follow public-key-based paradigm, and their computational complexities are determined by the number of expensive public key operations. Kissner and Song [11] proposed the first MPSI-CA protocol in the semi-honest model. This protocol relies on polynomial evaluation and homomorphic encryption (HE), and the overall computational complexity of it is \(O\left( n^{2} m_{\max }^{2}\right) \), where n is the number of parties and \(m_{max}\) is the maximum set size. Debnath et al. [2] presented an MPSI-CA protocol based on inverse bloom filter (IBF) and HE. The protocols in [2, 11] are both proven secure against arbitrary collusion. Despite their good properties in privacy preserving, it is impractical for resource-limited devices with large data sets to carry out these protocols due to the massive computational overhead.

To tackle with this problem, two practical schemes have been proposed. Chandran et al. [1] introduced a circuit-based generic multi-party computation protocol, which can be extended to realize MPSI-CA by modifying the circuit. However, this protocol is only proven secure with honest majority in semi-honest model. Besides, a concurrent work of [17] presented two OPPRF-based MPSI-CA protocols under the additional assumption that specific parties are non-colluding, which deviate from the well-known “threshold security”. Although assuming the existence of some specific non-colluding parties can improve the performance, it is believed that the “threshold security” is closer to real life applications for the following reasons: (1) There may not always exist such well-established non-colluding parties to participate in the protocol; (2) The identities of corrupted parties may be kept secret to honest parties, so it is unrealistic to assume that specific parties are non-colluding and to appoint them to perform special tasks.

Therefore, how to design and implement a practical semi-honest secure MPSI-CA scheme under arbitrary collusion is still worth studying.

1.2 State of the Art of Two-Party PSI-CA-Sum

(Since there is no result of PSI-CA-sum in the multi-party setting) we sketch some known results on the two-party PSI-CA-sum [7, 9, 10]. Motivated by the business problem of online-to-offline advertisement conversions, Ion et al. [10] introduced the first two-party PSI-CA-sum protocol by applying the classic Diffie-Hellman style construction into this new scenario. The protocol then was further polished in [9] and developed into two new constructions, built on modern techniques like random OT, which nevertheless rely on expensive HE as a building block for aggregating intersection-sum. Garimella et al. [7] put forward a lightweight two-party PSI-CA protocol by adopting oblivious switching network and OT to successfully avoid the reliance on HE.

1.3 Our Contributions

In this paper we formalize the notion of MPSI-CA-sum and propose the first MPSI-CA protocol and MPSI-CA-sum protocol that can achieve simultaneous practicality and security against arbitrary collusion. Details are as follows.

MPSI-CA Under Arbitrary Collusion. Our MPSI-CA protocol admits the following properties and advantages.

It is the first practical realization of MPSI-CA under arbitrary collusion to our knowledge, and we also conduct an implementation to verify its practicality (while the previous results under arbitrary collusion only present theoretical analysis of performance without real implementation).
The cost of clients is especially lower than the existing schemes with the same security.
Its computational efficiency is attributed to the element sharing technique and underlying lightweight primitives, which do not require any public key operations besides a set of base OTs.
In our implementation, most of the expensive operations can be shifted to an offline phase to significantly reduce the running time of online computation. Numeric results show that even in the dishonest majority setting with 15 parties each holding \(2^{16}\) data, it only takes 12.805 s to finish the online computation, which is about one fourth of the original running time.

Table 1 compares our MPSI-CA protocol with current MPSI-CA schemes with respect to security and computational complexity. On one hand, when compared to the existing practical schemes [1, 17], our protocol is more secure, since the existing schemes are not resistant to arbitrary collusion (remark that our protocol is also of practicality which is incomparable to the schemes in [1, 17] due to different running frameworks). On the other hand, when compared to the existing schemes secure against arbitrary collusion [2, 11], our protocol is much more practical, since it adopts a set of base OTs and symmetric key operations to reduce the number of expensive public key operations.

Table 1. Comparison between MPSI-CA schemes

Full size table

MPSI-CA-sum Under Arbitrary Collusion. We formalize the notion of MPSI-CA-sum and propose the first MPSI-CA-sum protocol that achieves simultaneous practicality and security against arbitrary collusion. Its computational complexity is roughly double that of our MPSI-CA protocol. Compared with most two-party PSI-CA-sum schemes, our protocol avoids the usage of expensive HE in aggregating intersection-sum, thus greatly reducing the computational cost.

Additional Contributions. Besides the main contributions, we also introduce the new notions and efficient constructions of two new building blocks of our MPSI-CA and MPSI-CA-sum protocols: multi-party secret-shared shuffle and oblivious zero-sum check.

Multi-party secret-shared shuffle helps multiple parties jointly shuffle the sum of their input data in an unknown permutation \(\pi \) and obtain additive secret shares of the result. It is an advancement of the multi-party Permute+Share [14] because it can hide \(\pi \) even when confronted with arbitrary collusion. Our construction is practical since its costly operations can be shifted to an offline phase.
Oblivious zero-sum check is a primitive that can securely determine whether the sum of multiple parties’ inputs is 0 without revealing anything else. Our construction of oblivious zero-sum check employs Beaver triples to reduce online computational overhead.

1.4 High-Level Description

In this part, we present a high-level overview of our MPSI-CA and MPSI-CA-sum protocols. Our protocols involve n parties, including \(T=t+1\) leaders \(L_{1},...,L_{T}\) and \(n-T\) clients \(P_{1},...,P_{n-T}\), where t is the corruption threshold (t can be up to \(n-1\)). In order to differentiate between leaders, leader \(L_{1}\) is called the primary leader, and the rest of the leaders are called secondary leaders. Each party holds a private set with size m. The data set of the i-th leader \(L_{i}\) is \(X_{i},i \in [T]\), and that of the j-th client \(P_{j}\) is \(S_{j},j \in [n-T]\).

As shown in Fig. 1, in the setting of MPSI-CA, clients first share their encoded data sets to leaders through element sharing, so that the original n-party MPSI-CA problem can be reduced to T-party MPSI-CA of T leaders, where \(T=t+1\). Then, primary leader \(L_{1}\) invokes OPPRFs with all secondary leaders \(L_{i},i\in [2,T]\) on each element \(x_{1,k}\in X_{1}\). If \(x_{1,k}\) belongs to the intersection, then the sum of all leaders’ outputs and \(L_{1}\)’s element sharing on \(x_{1,k}\) equals 0, which is denoted as \(t_{k}\). After participating in T-party secret-shared shuffle, each leader \(L_{i}\) obtains a random additive share of shuffled set \(\{t_{\pi (k)}\}_{k\in [m]}\), where the shuffle order \(\pi \) is kept secret to all parties. Finally, leaders perform oblivious zero-sum check to securely calculate the number of elements that satisfy \(\gamma _{k} t_{\pi (k)}=0\), where the random value \(\gamma _{k}\) is unknown to any leader. If \(\gamma _{k} t_{\pi (k)}=0\), then \(L_{1}\) adds one to intersection cardinality, otherwise the value of \(t_{\pi (k)}\) will not be revealed.

In the setting of MPSI-CA-sum, parties need to perform element sharing (payload sharing), OPPRF and secret-shared shuffle on both elements and their associated payloads. After running oblivious zero-sum check on elements, \(L_{1}\) can obtain a binary vector \(\vec {e}\), which indicates the shuffled indices of elements that belong to the intersection. As for those elements, \(L_{1}\) invokes OTs with all other leaders using choice string \(\vec {e}\) to aggregate the sum of their associated payloads.

1.5 Organizations

Section 2 introduces the preliminaries. In Sect. 3, the notions and constructions of two new building blocks are presented. We propose the practical MPSI-CA and MPSI-CA-sum protocols in Sect. 4 and 5, respectively. The computational complexity of MPSI-CA-sum protocol is roughly double that of our MPSI-CA protocol, therefore we focus on implementing and analyzing the performance of our MPSI-CA protocol in Sect. 6.

2 Preliminaries

Notations. We use \(\kappa \) and \(\lambda \) to denote the computational and statistical security parameters. The set \(\{1,2, \ldots , x\}\) is denoted as [x] (thus \(\sum _{i=1}^{T}\) is equivalent to \(\sum _{i\in [T]}\)). If the elements of a set \(\{x_{1},\dots ,x_{m}\}\) are arranged in order, then this set can be expressed in the form of a vector \(\vec {x}=(x_{1},\dots ,x_{m})\). Therefore, \(\vec {x}+ \vec {y}\) means performing addition on corresponding elements in two sets x and y to obtain \(\{x_{1}+ y_{1},\dots ,x_{m}+ y_{m}\}\). Given a permutation \(\pi \) and a set \(\vec {x}=(x_{1},\dots ,x_{m})\), we represent the operation of shuffling the positions of elements in this set using permutation \(\pi \) with \(\pi (\vec {x})=(x_{\pi (1)},\dots ,x_{\pi (m)})\). The set intersection is denoted as IS, and the intersection cardinality is |IS|.

Security Definitions. The parties corrupted by a semi-honest adversary \(\mathcal {A}\) will faithfully follow the protocol, while attempting to learn about other parties’ inputs. Moreover, those corrupted parties will collude with each other. By “non-colluding parties”, we mean that at most one of those parties can be corrupted by \(\mathcal {A}\); while “arbitrary collusion” means that \(\mathcal {A}\) may corrupt any proper subset of all parties, which is the most challenging case. The coalition of corrupted parties is denoted as \(\mathcal {C}\). Let \(\varPi \) be a protocol and f be a deterministic functionality. We define the following distributions of random variables and use the real-ideal simulation paradigm to formally define the semi-honest security of \(\varPi \) [6]. In this paper, we prove the security of all the protocols based on Definition 1.

\({\text {Real}}_{\varPi }\left( \kappa , \mathcal {C};x_{1},\dots ,x_{n}\right) \): Each party \(P_{i}\) runs the protocol honestly using private input \(x_{i}\) and security parameter \(\kappa \). Output \(\left\{ V_{i} \mid i \in \mathcal {C}\right\} ,\left( y_{1}, \ldots , y_{n}\right) \), where \(V_{i}\) and \(y_{i}\) denote the final view and output of party \(P_{i}\).
\({\text {Ideal}}_{f, \mathcal {S}}\left( \kappa , \mathcal {C} ; x_{1}, \ldots , x_{n}\right) \): Compute \(\left( y_{1}, \ldots , y_{n}\right) \leftarrow f\left( x_{1}, \ldots , x_{n}\right) \). Output \(\mathcal {S}\left( \mathcal {C},\left\{ \left( x_{i}, y_{i}\right) \mid i \in \mathcal {C}\right\} \right) ,\left( y_{1}, \ldots , y_{n}\right) \), where \(\mathcal {S}\) is a probabilistic polynomial time (PPT) simulator.

Definition 1

[6] We say that protocol \(\varPi \) securely computes f in the presence of a semi-honest adversary, if there exists a PPT simulator \(\mathcal {S}\) such that for \(\mathcal {C}\) and all inputs \(x_{1},\dots ,x_{n}\), the distributions \({\text {Real}}_{\varPi }\left( \kappa , \mathcal {C};x_{1},\dots ,x_{n}\right) \) and \({\text {Ideal}}_{f, \mathcal {S}}\left( \kappa , \mathcal {C} ; x_{1}, \ldots , x_{n}\right) \) are computationally indistinguishable in \(\kappa \).

Oblivious Key-Value Store (OKVS). The definitions of key-value store (KVS) and OKVS were first given in [8]. An OKVS is a generalized data structure that stores the mapping from keys to their values, and it can be instantiated with polynomial, garbled bloom filter (GBF) [4] and so on.

Definition 2

[8] A KVS is parameterized by a set \(\mathcal {K}\) of keys and a set \(\mathcal {V}\) of values, and consists of two algorithms: (1) \({\text {Encode}}\) takes as input a set of \(\left( k_{i}, v_{i}\right) \) key-value pairs and outputs an object S (or, with statistically small probability, an error indicator \(\perp \)); (2) \({\text {Decode}}\) takes as input the object S, a key k and outputs a value v. A KVS is correct if, for all \(A \subseteq \mathcal {K} \times \mathcal {V}\) with distinct keys: \((k, v) \in A \text{ and } \perp \ne S \leftarrow {\text {Encode}}(A) \Longrightarrow {\text {Decode}}(S, k)=v\)

A KVS is an OKVS if, for any two sets \(\mathcal {K}^{0}, \mathcal {K}^{1}\) of m distinct keys, the output of \(\mathcal {R}\left( \mathcal {K}^{1}\right) \) is computationally indistinguishable to that of \(\mathcal {R}\left( \mathcal {K}^{0}\right) \), where:

Oblivious Programmable Pseudorandom Function (OPPRF, \(\boldsymbol{\mathcal {F}_\textrm{opprf}^{F,m,u}}\)). The formal definition of OPPRF was first given in [12], which also provided a semi-honest secure realization. An OPPRF takes as input the queries \(\left( q_{1}, \ldots , q_{u}\right) \) from receiver and a programmed set \(\mathcal {P}=\{\langle {x_{i}, y_{i}\rangle }\}_{i\in [m]}\) from sender. Then, the receiver’s OPPRF outputs satisfy the following property: if the query \(q_{j}=x_{i} \in \mathcal {P}\), then its OPPRF output equals \(y_{i}\), otherwise the output is pseudorandom. Generally speaking, receiver’s OPPRF outputs are fixed at some selected points.

Multi-party Pemute+Share (\(\boldsymbol{\mathcal {F}_{\text {mPS}}^{T,m,i}}\)). \(\mathcal {F}_{\text {mPS}}^{T,m,i}\) takes as input the vectors \(\vec {x_{j}}\) from all parties \(P_{j},j\in [T]\) and a permutation \(\pi _{i}\) from sender \(P_{i}\), then outputs additive shares of shuffled sum \(\pi _{i}(\sum _{j=1}^{T} \vec {x}_{j})\) to every party. The functionality of \(\mathcal {F}_{\text {mPS}}^{T,m,i}\) was given in [14], along with an realization of \(\mathcal {F}_{\text {mPS}}^{T,m,i}\) based on OT and switching network, which is proven secure against a semi-honest adversary which may corrupt up to \(T-1\) parties. \(\mathcal {F}_{\text {mPS}}^{T,m,i}\) is an essential building block of our multi-party secret-shared shuffle primitive proposed in Sect. 3.

3 Two New Primitives and Constructions

In this section, we present the notions and constructions of two new building blocks for our MPSI-CA and MPSI-CA-sum protocols, namely the multi-party secret-shared shuffle and oblivious zero-sum check.

3.1 Multi-party Secret-Shared Shuffle

We formalize the new notion of multi-party secret-shared shuffle, and give a realization of it. It can help parties shuffle the sum of their inputs in an unknown permutation order \(\pi \), and obtain additive shares of the result.

Functionality (\(\boldsymbol{\mathcal {F}_{\text {mSS}}^{T,m}}\)). \(\mathcal {F}_{\text {mSS}}^{T,m}\) can be regarded as an advancement of the original \(\mathcal {F}_{\text {mPS}}^{T,m,i}\), since it ensures that none of the parties gets to know the permutation \(\pi \). \(\mathcal {F}_{\text {mSS}}^{T,m}\) receives permutations \(\pi _{i}\) and vectors \(\vec {x_{i}}\) from all parties \(P_{i},i\in [T]\), and gives them the additive shares of shuffled sum of inputs \(\pi (\sum _{i\in [T]} \vec {x}_{i})\) as outputs.

Protocol. We propose a protocol to realize \(\mathcal {F}_\textrm{mSS}^{T,m}\) as follows. This protocol invokes T rounds of T-party Permute+Share [14] in an iterative way. During the i-th round, \(P_{i}\) acts as the sender who provides permutation \(\pi _{i}\) and vector \(\vec {x}_{i}^{(i-1)}\), others act as receivers with vectors \(\vec {x}_{j}^{(i-1)},j\in [T] \backslash \{i\}\) (Here, \(\vec {x}_{j}^{(0)}=\vec {x}_{j}\), \(j\in [T]\)). Then, \(P_{j}\) receives an output \(\vec {x}_{j}^{\prime (i-1)}\), where \(\sum _{j\in [T]}\vec {x}_{j}^{\prime (i-1)}=\pi _{i}(\sum _{j\in [T]}\vec {x_{j}}^{(i-1)})\), and treats \(\vec {x}_{j}^{\prime (i-1)}\) as his input vector during the next round. Finally, each party \(P_{j}\) obtains an additive share \(\vec {x}_{j}^{\prime (T-1)}\) of the shuffled sum \(\pi (\sum _{j \in [T]} \vec {x}_{j}^{(0)})\) with permutation \(\pi =\pi _{T} \circ \dots \circ \pi _{1}\). If we adopt the Permute+Share scheme proposed in [14], then our realization of \(\mathcal {F}_\textrm{mSS}^{T,m}\) requires \(O(T(T-1)m\log m)\) OTs in total.

Correctness. By the definition of \(\mathcal {F}_\textrm{mPS}^{T,m,i}\), the sum of all parties’ outputs equals \(\pi _{T}(\sum _{j\in [T]} \vec {x}_{j}^{(T-1)})=\pi _{T}(\pi _{T-1}(\sum _{j\in [T]} \vec {x}_{j}^{(T-2)}))=\dots =\pi (\sum _{j\in [T]} \vec {x}_{j}^{(0)})\).

Theorem 1

This protocol securely computes \(\mathcal {F}_\textrm{mSS}^{T,m}\) under a semi-honest adversary which may corrupt up to \(T-1\) parties, if \(\mathcal {F}_\textrm{mPS}^{T,m}\) is secure against semi-honest adversaries.

Proof

The views of corrupted parties (i.e., \(\mathcal {C}\)) consist of their inputs and views during T invocations of \(\mathcal {F}_\textrm{mPS}^{T,m}\). As for the first round, simulator \(\mathcal {S}\) chooses random vectors as corrupted parties’ outputs by the definition of \(\mathcal {F}_\textrm{mPS}^{T,m}\), then treats them as inputs into the next round. By following the above strategies for each round of T-party Permuta+Share and leveraging the simulator of subroutine functionality \(\mathcal {F}_\textrm{mPS}^{T,m}\) in turn, the view of \(\mathcal {C}\) during \(\mathcal {F}_\textrm{mSS}^{T,m}\) can be ideally simulated by \(\mathcal {S}\).

3.2 Oblivious Zero-Sum Check

We present the notion and construction of the new primitive of oblivious zero-sum check. It can help parties securely determine whether the sum of their inputs is 0 without revealing anything else. It can be employed in the last step of MPSI-CA to obtain the intersection cardinality of shuffled data.

Functionality (\(\boldsymbol{\mathcal {F}_\textrm{OZK}^{T,m}}\)). \(\mathcal {F}_\textrm{OZK}^{T,m}\) receives input additive shares \(\langle \vec {x}\rangle _{i},i\in [T]\) from all parties, then outputs a binary vector \(\vec {e}=(e_{1},\dots ,e_{m})\) to \(P_{1}\). If the k-th position of the sum of input vectors \(\vec {x}=\sum _{i=1}^{T}\langle \vec {x}\rangle _{i}\) equals 0, then \(e_{k}=1\); otherwise \(e_{k}=0\) (i.e., \(e_{k}=1\) only when \(x_{k}=0\)). That is to say, \(\mathcal {F}_\textrm{OZK}^{T,m}\) ensures that \(P_{1}\) can not get to know the value of \(x_{k}\) unless it is equal to 0.

Protocol. As presented in Protocol 3.2, \(\mathcal {F}_\textrm{OZK}^{T,m}\) can be realized using secret sharing mechanism. Since each party holds an additive share of secret \(\vec {x}\), parties can obtain their additive shares of the product \(\vec {\gamma } \cdot \vec {x}\) using Beaver multiplication triples, where \(\vec {\gamma }\) is a “negotiated” random value and notation \(\cdot \) denotes component-wise multiplication of two vectors. \(\vec {\gamma }\) is kept secret to everyone, since each party \(P_{i}\) only knows an additive share \(\langle \vec {\gamma }\rangle _{i}\) of \(\vec {\gamma }\). If \(x_{k}=0\), it is obvious that the k-th position of \(\vec {\gamma } \cdot \vec {x}\) equals 0 (i.e., \(\gamma _{k}x_{k}=0\)); if \(x_{k}\ne 0\), \(P_{1}\) can not infer anything about \(x_{k}\) from \(\gamma _{k}x_{k}\) due to the random value \(\gamma _{k}\).

Parties need to interact with each other in order to obtain their additive shares of the product \(\vec {\gamma }\cdot \vec {x}\). We note that \(\vec {\gamma }\cdot \vec {x}=\sum _{i,j\in [T]}\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}\) can be divided into \(\sum _{i\in [T]}\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{i}\) and \((T^{2}-T)/2\) components \(\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}+\langle \vec {\gamma }\rangle _{j}\langle \vec {x}\rangle _{i}\), where \(i<j \in [T]\). For each component \(\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}+\langle \vec {\gamma }\rangle _{j}\langle \vec {x}\rangle _{i}\), it is feasible for \(P_{i}\) and \(P_{j}\) to securely obtain their additive shares \(sh^{i,j}_{0}\) and \(sh^{i,j}_{1}\) using Beaver triples by following Protocol 3.2. The online pairwise share-based multiplication will be greatly accelerated by consuming the Beaver triples, which have already been prepared in the setup stage. Finally, \(P_{i}\) sends the sum of \(\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{i}\) and his \(T-1\) shares of \(\sum _{j\in [T] \backslash \{i\}}(\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}+\langle \vec {\gamma }\rangle _{j}\langle \vec {x}\rangle _{i})\) to \(P_{1}\). So that \(P_{1}\) can reconstruct \(\vec {\gamma } \cdot \vec {x}\). If the k-th position of \(\vec {\gamma } \cdot \vec {x}\) equals 0, \(P_{1}\) sets \(e_{k}\) to 1, otherwise \(e_{k}=0\).

Correctness. It can be verified that \(sh^{i,j}_{0}+sh^{i,j}_{1}=\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}+\langle \vec {\gamma }\rangle _{j}\langle \vec {x}\rangle _{i}\) based on the property of Beaver triples. Therefore, the sum of all parties’ shares equals \(\sum _{i\in [T]}\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{i}+\sum _{1\le i<j\le T}(\langle \vec {\gamma }\rangle _{i}\langle \vec {x}\rangle _{j}+\langle \vec {\gamma }\rangle _{j}\langle \vec {x}\rangle _{i})=\vec {\gamma } \cdot \vec {x}\).

Theorem 2

Protocol 3.2 securely computes \(\mathcal {F}_\textrm{OZK}^{T,m}\) under a semi-honest adversary which may corrupt up to \(T-1\) parties.

Proof

In the trivial case that \(P_{1} \notin \mathcal {C}\), the views of corrupted parties \(\mathcal {C}\) can be simulated by substituting all shares with random vectors. If \(P_{1} \in \mathcal {C}\), for those positions where \(e_{k}=0\), all generated and received shares of randomized \(\gamma _{k}x_{k}\) are indistinguishable from uniformly random values; for positions where \(e_{k}=1\), shares can be simulated by choosing random values that sum to zero.

4 MPSI-CA Protocol Under Arbitrary Collusion

In this section, we recall the functionality of MPSI-CA and propose a semi-honest secure MPSI-CA protocol under arbitrary collusion. First, we introduce a technique called element sharing to reduce the original n-party MPSI-CA to T-party MPSI-CA of T leaders. Then, a detailed description of our MPSI-CA protocol is presented.

Functionality (\(\boldsymbol{\mathcal {F}_\mathrm{MPSI-CA}}\)). MPSI-CA allows n parties with m items to learn the intersection cardinality of their private sets without revealing anything else.

High-Level Description. The fundamental idea of our MPSI-CA protocol is to let all clients share their PRF-encoded data sets to T leaders \(L_{i}, i\in [T]\), and then delegate leaders to complete the task of T-party PSI-CA. T is set to be \(t+1\), otherwise the T-party MPSI-CA computation will be vulnerable to collusion attack in the worst case that all leaders are corrupted. Then, \(L_{1}\) invokes OPPRFs with all secondary leaders. After that, all leaders treat their modified outputs \(\vec {t_{i}},i\in [T]\) as inputs to the following multi-party secret-shared shuffle and oblivious zero-sum check, so that \(L_{1}\) can obtain the intersection cardinality.

4.1 Element Sharing

Considering that the overhead of MPSI-CA protocol tends to increase with the number of parties, it is a natural idea to delegate only a small number of parties to engage in expensive interactive procedures by sharing other parties’ PRF-encoded data sets to them in the first step. This trick was first adopted by [15] and is called element sharing for short in this paper.

The functionality \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\) of element sharing is that: for an element x, if \(x\in IS\), then each leader \(L_{i},i\in [T]\) holds a random additive share \(q_{i}(x)\) of 0 corresponding to x. The detailed process is shown in Sub-protocol 5.1, its correctness is obvious because if \(x \in IS\), then each PRF key \(K_{i,j}\) is used twice by both client \(P_{j}\) and leader \(L_{i}\) on the same item x, so that the two PRF outputs cancel out each other and \(\sum _{i=1}^{T}q_{i}(x)=0\).

We show that Sub-protocol 5.1 can securely compute \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\) under a semi-honest adversary which may corrupt up to t parties \((t<n)\) by giving a sketch of how to simulate the views of corrupted parties in the ideal world. The ideal views of corrupted clients are easy to simulate since they receive no messages. For corrupted \(L_{i},i\in [2,T]\), his received PRF keys can be simulated using random values. For the corrupted \(L_{1}\), the OKVS \(D_{j}\) (from an honest party \(P_{j}\)) appears random to him, since all the values encoded in \(D_{j}\) are encrypted using \(P_{j}\)’s \(T-1\) PRF keys. Therefore, \(\mathcal {S}\) can easily simulate the OKVS by generating an OKVS that encode m random key-value pairs, which is computationally indistinguishable from his real view by the obliviousness property of OKVS.

4.2 Detailed Description

As shown in Protocol 4.2, leader \(L_{i}\) utilizes the bucketing technique [12] to hash his elements into a hash table \(Table_{i}\) with b bins using simple hashing (or cuckoo hashing when \(i=1\)) with hash functions \(h_{1}, h_{2}, h_{3}\). For cuckoo hash table \(Table_{1}\), each element \(x\in X_{1}\) will be inserted into only one bin, say \(Table_{1}[h_{u}(x)]=x\) for some \(u\in [3]\). Finally, each empty bin will be padded with a dummy element. As for the simple hash table \(Table_{i},i\in [2,T]\), each \(x\in X_{i}\) will be inserted into three bins \(Table_{i}[h_{1}(x)]\), \(Table_{i}[h_{2}(x)]\) and \(Table_{i}[h_{3}(x)]\). When the number of hash functions is 3, the stash size can be reduced to 0 by setting \(b=1.28m\) while achieving a hashing failure probability of \(2^{-40}\) [16].

After invoking \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\) on b queries \(Table_{1}[k], k\in [b]\), if \(Table_{1}[k] \in IS\), then leaders hold additive shares of 0 (i.e., \(t_{k}=\sum _{i\in [T]}t_{i,k}=0\)). In order to obtain the number of k that satisfies \(t_{k}=0\) without revealing the index k, leaders invoke \(\mathcal {F}_\textrm{mSS}^{T,m}\) to obtain additive shares of shuffled \(t_{\pi (k)}\), where \(\pi \) is unknown to anyone. Then they engage in \(\mathcal {F}_\textrm{OZK}^{T,m}\) to securely aggregate and rerandomize the value of \(t_{\pi (k)}\). By the definition of \(\mathcal {F}_\textrm{OZK}^{T,m}\), the output \(\gamma _{k}t_{\pi (k)}\) equals 0 only when \(t_{\pi (k)}=0\), therefore \(L_{1}\) adds one to intersection cardinality |IS|.

Correctness. If element \(Table_{1}[k] \in IS\), then from the property of element sharing and OPPRF, we have \(\sum _{i\in [T]}q_{i}(Table_{1}[k])=0\) and \(r_{i,k}=q_{i}(Table_{1}[k])- t_{i,k}\), and thus \(t_{k}=0\). By the correctness of multi-party secret-shared shuffle and oblivious zero-sum check, \(L_{1}\) successfully reconstructs \(\gamma _{k}t_{\pi (k)}=0\), and knows there exists one more element that belongs to IS. Otherwise, if \(Table_{1}[k]\) does not belong to some \(X_{i}\) or \(S_{j}\), then either the OPPRF output \(r_{i,k}\) or the OKVS decode output \(q_{1}(Table_{1}[k])\) is a random value. Therefore, the probability that there exists an element \(Table_{1}[k] \notin IS\) s.t. \(\gamma _{k}t_{\pi (k)}=0\) is negligible.

Theorem 3

Protocol 4.2 securely computes \(\mathcal {F}_\mathrm{MPSI-CA}\) under a semi-honest adversary which may corrupt up to t parties \((t<n)\), if \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\), \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\), \(\mathcal {F}_\textrm{mSS}^{T,b}\) and \(\mathcal {F}_\textrm{OZK}^{T,b}\) are secure against semi-honest adversaries.

Proof

We divide the proof into three cases.

Case1: (\(L_{i}\notin \mathcal {C},i\in [T]\)). In this trival case, the views of corrupted parties (i.e., \(\mathcal {C}\)) can be easily simulated since they receive no messages.

Case2: (\(L_{1} \notin \mathcal {C}\)). In this case, \(\mathcal {C}\) receives no final output. The views of corrupted clients can be simulated in a way similar to Case 1. As for those corrupted \(L_{i},i\in [2,T]\), simulator \(\mathcal {S}\) first chooses a random key k to simulate \(L_{i}\)’s output of \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\) since \(\mathcal {C}\) only sees the senders’ views. Then, by the definition of \(\mathcal {F}_\textrm{mSS}^{T,b}\), \(\mathcal {S}\) chooses a random vector \(\vec {t^{\prime }_{i}}\) as his output of \(\mathcal {F}_\textrm{mSS}^{T,b}\), and leverages the simulators of subroutine functionalities \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\), \(\mathcal {F}_\textrm{mSS}^{T,b}\), \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\) and \(\mathcal {F}_\textrm{OZK}^{T,b}\) to simulate the view of corrupted \(L_{i}\). The view output by \(\mathcal {S}\) is indistinguishable from \(\mathcal {C}\)’s real view, which is obtained by the underlying simulators’ indistinguishability.

Case3: (\(L_{1} \in \mathcal {C}\)). In this case, \(\mathcal {C}\) receives |IS| as final output. \(\mathcal {S}\) can simulate \(\mathcal {C}\)’s view as follows. In step 2(b), it simulates \(L_{1}\)’s OPPRF outputs \(\vec {r_{i}},i\in [2,T]\) using uniformly random values while ensuring that: if \(L_{i}\in \mathcal {C}\), then \(r_{i,k}=q_{i}(Table_{1}[k])-t_{i,k}\) for every element \(Table_{1}[k]\) that belongs to \(X_{1}\cap X_{i}\), otherwise \(r_{i,k}\) and \(t_{i,k}\) are independent; if \(L_{i}\notin \mathcal {C}\), then \(L_{1}\)’s \(r_{i,k}\) is picked at random. In step 2(d), by the definition of \(\mathcal {F}_\textrm{mSS}^{T,b}\), it simulates corrupted parties’ outputs of \(\mathcal {F}_\textrm{mSS}^{T,b}\) using uniformly random vectors. In step 2(e), it simulates \(L_{1}\)’s output \(\vec {e}\) of \(\mathcal {F}_\textrm{OZK}^{T,b}\) by uniformly sampling a binary vector with |IS| ones due to the uniformly distributed permutation adopted in \(\mathcal {F}_\textrm{mSS}^{T,b}\). After that, \(\mathcal {S}\) can leverage the simulators of subroutine functionalities \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\), \(\mathcal {F}_\textrm{mSS}^{T,b}\), \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\) and \(\mathcal {F}_\textrm{OZK}^{T,b}\) to simulate the view of \(\mathcal {C}\). The view output by \(\mathcal {S}\) is indistinguishable from \(\mathcal {C}\)’s real view, which is obtained by the underlying simulators’ indistinguishability.

5 MPSI-CA-Sum Protocol Under Arbitrary Collusion

In this section, we first introduce a technique called payload sharing to share the payloads of clients to leaders. Then we smoothly extend Protocol 4.2 to provide a practical MPSI-CA-sum protocol that is secure under arbitrary collusion.

Functionality (\(\boldsymbol{\mathcal {F}_\mathrm{MPSI-CA-sum}}\)). To the best of our knowledge, we are the first to formalize the notion of MPSI-CA-sum. The functionality of MPSI-CA-sum is a generalization of the two-party PSI-CA-sum proposed in [10], with some modifications as to the number of parties that hold the payloads. The associated payload of element x is denoted as \(v_{i}(x)\) at leader \(L_{i}\)’s side and \(w_{j}(x)\) at client \(P_{j}\)’s side, respectively. The purpose of MPSI-CA-sum is to securely output the |IS| and intersection-sum \(Sum_{IS}\), which is shown in Functionality 6.

High-Level Description. The procedures of our MPSI-CA-sum protocol are similar to those of Protocol 4.2. Parties perform payload sharing, OPPRF and shuffle on their associated payloads of each element, and run Protocol 4.2 in parallel to obtain a binary vector \(\vec {e}\), which shows the shuffled indices of elements that belong to IS. As for those shuffled elements that belong to IS, \(L_{1}\) invokes OTs with all other leaders using choice string \(\vec {e}\), in order to aggregate the sum of their associated payloads (i.e., intersection-sum).

5.1 Payload Sharing

Sub-protocol 5.1 presents the steps of payload sharing, which aims to share the payloads of clients to T leaders. The functionality \(\mathcal {F}_\textrm{PaySh}^{n,T,m}\) of payload sharing is that: for an element x, if \(x\in IS\), then each leader \(L_{i},i\in [T]\) holds an additive share \(\widehat{v}_{i}(x)\) of the sum of payloads corresponding to x (i.e., \(\sum _{i=1}^{T}v_{i}(x)+\sum _{j=1}^{n-T}w_{j}(x)\)). The procedures of payload sharing are similar to those of element sharing. The correctness of Sub-protocol 5.1 relies on the correctness of (T, T) additive secret sharing scheme and the property that the PRF output of input x with a fixed PRF key \(K^{\prime }_{i,j}\) is deterministic.

We show that Sub-protocol 5.1 can securely compute \(\mathcal {F}_\textrm{PaySh}^{n,T,m}\) under a semi-honest adversary which may corrupt up to t parties \((t<n)\) by briefly simulating the view of \(\mathcal {C}\). The views of corrupted clients can be easily simulated since they receive no messages from the others. For corrupted \(L_{i},i\in [T]\), his received PRF key and OKVS from an honest party can be simulated using a random value and an OKVS that encodes m random key-value pairs, which are computationally indistinguishable from the real view by the obliviousness property of OKVS.

5.2 Detailed Description

The MPSI-CA-sum protocol under arbitrary collusion is presented in Protocol 5.2. Step 3 and step 4 can be executed in parallel by concatenating each element with its associated payload to avoid the cost of repeatedly invoking \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\) and \(\mathcal {F}_\textrm{mSS}^{T,b}\). But note that there is no need to perform \(\mathcal {F}_\textrm{OZK}^{T,b}\) on additive shares of the shuffled sum of payloads \(\pi (\vec {g})\) (i.e., \(\vec {g^{\prime }_{i}}\), \(i\in [T]\)). The hash table \(Table_{i},i\in [T]\) used in step 4 is generated in step 3 by following step 2(a) of Protocol 4.2.

After invoking \(\mathcal {F}_\textrm{OZK}^{T,b}\) during step 3, leader \(L_{1}\) outputs a vector \(\vec {e}=(e_{1},\dots ,e_{b})\). If \(e_{k}=1\), it means that the element in the \(\pi ^{-1}(k)\)-th bin of \(Table_{1}\) belongs to the intersection IS. Although \(L_{1}\) can not infer the original index of this element (i.e., \(\pi ^{-1}(k)\)) from k, he knows the existence of such an element. Therefore, he can still aggregate its associated payloads by invoking b OTs with each secondary leader \(L_{i}, i\in [2,T]\). In the k-th OT with \(L_{i}\), \(L_{1}\) acts as a receiver with choice bit \(e_{k}\), \(L_{i}\) acts as a sender who provides two strings \((mk_{i,k},mk_{i,k}+g^{\prime }_{i,k})\), where the random masks \(mk_{i,k},i\in [2,T],k\in [b]\) satisfy \(\sum _{i=2}^{T}\sum _{k=1}^{b}mk_{i,k}=0\). Those masks can be generated through additive secret sharing within secondary leaders. First, each secondary leader \(L_{i},i\in [2,T]\) locally generates a random vector \(\vec {mk^{\prime }_{i}}=(mk^{\prime }_{i,1},\dots , mk^{\prime }_{i,b})\) that ensures \(\sum _{k=1}^{b}mk^{\prime }_{i,k}=0\). Then, \(L_{i},i\in [2,T]\) performs \((T-1,T-1)\) additive secret sharing on vector \(\vec {mk}^{\prime }_{i}\), and sends \(T-2\) shares to other secondary leaders. Finally, \(L_{i},i\in [2,T]\) sums all his received shares and his local share together to obtain the new random mask vector \(\vec {mk}_{i}\).

Correctness. Since the correctness of |IS| (obtained in step 3) has already been proven in Section 4.2, here we only prove the correctness of \(Sum_{IS}\). If \(e_{k}=1\), then we have \(Table_{1}[\pi ^{-1}(k)]\in IS\) and \(\sum _{i=1}^{T}g_{i,k}^{\prime }=\sum _{i=1}^{T}\widehat{v}_{i}(Table_{1}[\pi ^{-1}(k)])\). After invoking OTs with each secondary leader, \(L_{1}\) adds the \(b(T-1)\) OT outputs together to obtain \(\sum _{i=2}^{T}\sum _{k=1}^{b}mk_{i,k}+\sum _{e_{k}=1, k \in [b]}\sum _{i=2}^{T}g^{\prime }_{i,k}=\sum _{e_{k}=1, k \in [b]}\sum _{i=2}^{T}g^{\prime }_{i,k}\). Therefore, we have \(\sum _{e_{k}=1, k \in [b]}\sum _{i=2}^{T}g^{\prime }_{i,k}+\sum _{e_{k}=1, k \in [b]}g^{\prime }_{1,k}=\sum _{x\in IS}\sum _{i=1}^{T}\widehat{v}_{i}(x)=\sum _{x\in IS}\left( \sum _{i=1}^{T}v_{i}(x)+\sum _{j=1}^{n-T}w_{j}(x)\right) =Sum_{IS}\).

Theorem 4

Protocol 5.2 securely computes \(\mathcal {F}_\mathrm{MPSI-CA-sum}\) under a semi-honest adversary which may corrupt up to t parties \((t<n)\), if \(\mathcal {F}_\textrm{ElemSh}^{n,T,m}\), \(\mathcal {F}_\textrm{PaySh}^{n,T,m}\) \(\mathcal {F}_\textrm{opprf}^{F,3m,b}\), \(\mathcal {F}_\textrm{mSS}^{T,b}\) and \(\mathcal {F}_\textrm{OZK}^{T,b}\) and OT are secure against semi-honest adversaries.

Proof

(sketch) The view of \(\mathcal {C}\) during step 1–4 can be simulated by following similar strategies given in Theorem 3 of Protocol 4.2. Given |IS|, \(L_{1}\)’s input choice string \(\vec {e}\) can be simulated with a uniform binary vector with |IS| ones. Since \(\vec {mk_{i}}\) is a random mask, \(\mathcal {S}\) can simulate corrupted \(L_{i}\)’s OT inputs \((mk_{i,k},mk_{i,k}+g^{\prime }_{i,k})\) using random values, and simulate corrupted \(L_{1}\)’s OT outputs from honest parties using random values, while ensuring that all \(b(T-1)\) OT outputs sum to \(Sum_{IS}-\sum _{e_{k}=1,k \in b} g^{\prime }_{1,k}\). Then, \(\mathcal {C}\)’s view can be simulated by leveraging the simulator of underlying OT.

6 Experimental Evaluation

Since the operations of computing intersection-sum is similar to those of computing intersection cardinality, the computational complexity of our MPSI-CA-sum protocol is roughly double that of our MPSI-CA protocol. So in this section, we only focus on evaluating the performance of our MPSI-CA protocol.

Parameters and Settings. We set statistical security parameter \(\lambda =40\) and computational security parameter \(\kappa =128\). We run our experiments on a laptop with an Intel i7-12700H 2.30 GHz CPU, 28 GB RAM, and Ubuntu-20.04 system in LAN setting. We instantiate the OPPRF using the realization provided in [12]. In the setup stage, it takes every two parties about 32 s to generate \(2^{18}\) Beaver triples [3]. Each party adopts separated threads to communicate with others to ensure parallelism. Besides, we divide our protocol into offline and online phases in the experiment. The offline phase consists of all base OT operations in secret-shared shuffle, which can be carried out in advance because they are independent of the input sets. The online phase consists of all the remaining operations: element sharing, OPPRF, secret-shared shuffle (without base OT) and oblivious zero-sum check (without Beaver triples generation).

Running Time and Communication Cost of MPSI-CA (Protocol 4.2). Table 2 shows the running time of our MPSI-CA protocol in both online and offline phases, as well as its communication cost, which includes both sent and received messages.

Table 2. The total running time, total communication cost, and the running time of different steps in our MPSI-CA protocol (Protocol 4.2).

Full size table

We present the performance of clients and primary leader under three different corruption conditions, namely when \(t=1,n/2\) and \(n-1\). Assuming \(t=1\), it takes our MPSI-CA protocol only 27.174 s to compute the multi-party intersection cardinality of 15 parties, each with a large set size of \(2^{18}\). In the honest majority situation where \(t=n/2\), the running time of leaders increases with the number of parties participating in multi-party secret-shared shuffle, and thus the running time is linear in t. When \(n=15\) and \(m=2^{12}\), the total running time is about 4.576 s with the online phase taking only 0.926 s. In the most challenging dishonest majority setting where \(t=n-1\), parties are not allowed to share their sets to leaders for fear of collusion attack, therefore, the number of leaders has to be n. However, since most of the expensive operations of multi-party secret-shared shuffle can be shifted to an offline phase, the total online running time can be reduced to only one fourth of the original time.

With respect to the communication performance of different parties, the cost of client is nearly independent of n and t. Whereas the cost of primary leader not only depends on n, but is also linear in the number of leaders \(T=t+1\). Concretely, when the set size is large (i.e., \(m=2^{18}\)), our protocol takes roughly 7KB communication cost per item at each leader’s side when \(n=5, t=4\), which includes both sent and received messages. This cost increases to about 25 KB per item in the most challeging case that \(n=15\), \(t=14\) and \(m=2^{18}\).

Running Time of Different Steps in Protocol 4.2. Table 2 lists the running time of different steps in Protocol 4.2 when \(t=n-2\). As shown in the table, the two steps OPPRF and shuffle take a large percentage of the total running time. When n grows, the change in the running time of OPPRF is slight since each party adopts separated threads to communicate with others to ensure parallelism. In the case that \(m=2^{16}\), \(n=15\) and \(t=13\), it only takes 7.881 s to finish the online task of MPSI-CA computation with this simple optimization.

Comparison with Other Works. There are only two MPSI-CA schemes [2, 11] secure against arbitrary collusion in the semi-honest adversary model, but they only give theoretic analysis of performance without experimental results. Table 3 compares the performance of them and our MPSI-CA protocol (Protocol 4.2) in terms of computational and communication complexities.

As shown in Table 3, [2, 11] both rely on a large number of expensive public key operations, which is linear in the maximal set size \(m_{max}\) or even \(m_{max}^{2}\). Therefore, it is impractical for resource-limited devices with large data sets to carry out these protocols due to the massive computational overhead. Moreover, the efficiency of those schemes remains to be improved in the unbalanced data setting (i.e., the minimal set size \(m_{min}\ll m_{max}\)), or when the number of corrupted parties t only accounts for a small percentage of n.

By adopting lightweight primitives which do not require any public key operations besides a set of base OTs, the number of public key operations in our MPSI-CA protocols is independent of set size, which is significantly lower than that of [2, 11]. At the same time, clients only need to send their PRF-encoded data to leaders instead of participating in expensive cryptographic interactive protocols for themselves, so that the total computational complexity can be significantly reduced especially when t/n is small. Besides, all the OTs required in multi-party secret-shared shuffle can be carried out in an offline phase, thus further decreasing the online computational complexity of our MPSI-CA protocol.

Table 3. The computational and communication complexities of MPSI-CA schemes, where m is the average set size, \(m_{max}\) is the largest set size and \(m_{1}\) is \(L_{1}\)’s set size; k is the ratio of OKVS size to its encoded set size m. In our Protocol 4.2, most of the public key operations can be shifted to the offline phase.

Full size table

With respect to communication and round complexities, [2, 11] both involve O(n) rounds due to the operation of passing on randomized ciphertexts to the next party in a circle. Whereas we only need to perform T-party shuffle within \(T=t+1\) leaders, and the round complexity is O(t). Although utilizing expensive HE can save the communication cost during the stage of multi-party shuffle in [2], we reckon that the gap between [2] and our MPSI-CA scheme can be narrowed in an unbalanced setting, for we can designate the party with the smallest data set to be leader \(L_{1}\) to ensure that \(m_{1}\ll m< m_{max}\). In this case, the additional communication overhead brought by multi-party secret-shared shuffle and oblivious zero-sum check can be reduced, so that the communication performance of our MPSI-CA scheme is comparable to that of [2].

Change history

19 February 2023
A correction has been published.

References

Chandran, N., Dasgupta, N., Gupta, D., Obbattu, S.L.B., Sekar, S., Shah, A.: Efficient linear multiparty psi and extensions to circuit/quorum PSI. In: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, pp. 1182–1204 (2021)
Google Scholar
Debnath, S.K., Stǎnicǎ, P., Kundu, N., Choudhury, T.: Secure and efficient multiparty private set intersection cardinality. Adv. Math. Commun. 15(2), 365 (2021)
Article MathSciNet MATH Google Scholar
Demmler, D., Schneider, T., Zohner, M.: Aby-a framework for efficient mixed-protocol secure two-party computation. In: NDSS (2015)
Google Scholar
Dong, C., Chen, L., Wen, Z.: When private set intersection meets big data: an efficient and scalable protocol
Google Scholar
Egert, R., Fischlin, M., Gens, D., Jacob, S., Senker, M., Tillmanns, J.: Privately computing set-union and set-intersection cardinality via bloom filters. In: Foo, E., Stebila, D. (eds.) ACISP 2015. LNCS, vol. 9144, pp. 413–430. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19962-7_24
Chapter MATH Google Scholar
Evans, D., Kolesnikov, V., Rosulek, M., et al.: A pragmatic introduction to secure multi-party computation. Found. Trends® Priv. Secur. 2(2–3), 70–246 (2018)
Google Scholar
Garimella, G., Mohassel, P., Rosulek, M., Sadeghian, S., Singh, J.: Private set operations from oblivious switching. In: Garay, J.A. (ed.) PKC 2021. LNCS, vol. 12711, pp. 591–617. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75248-4_21
Chapter Google Scholar
Garimella, G., Pinkas, B., Rosulek, M., Trieu, N., Yanai, A.: Oblivious key-value stores and amplification for private set intersection. In: Malkin, T., Peikert, C. (eds.) CRYPTO 2021. LNCS, vol. 12826, pp. 395–425. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-84245-1_14
Chapter Google Scholar
Ion, M., et al.: On deploying secure computing: private intersection-sum-with-cardinality. In: 2020 IEEE European Symposium on Security and Privacy (EuroS &P), pp. 370–389. IEEE (2020)
Google Scholar
Ion, M., et al.: Private intersection-sum protocol with applications to attributing aggregate ad conversions. Cryptology ePrint Archive (2017)
Google Scholar
Kissner, L., Song, D.: Private and threshold set-intersection. Carnegie-mellon univ pittsburgh pa dept of computer science. Technical report (2004)
Google Scholar
Kolesnikov, V., Matania, N., Pinkas, B., Rosulek, M., Trieu, N.: Practical multi-party private set intersection from symmetric-key techniques. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1257–1272 (2017)
Google Scholar
Lv, S., et al.: Unbalanced private set intersection cardinality protocol with low communication cost. Futur. Gener. Comput. Syst. 102, 1054–1061 (2020)
Article Google Scholar
Mohassel, P., Sadeghian, S.: How to hide circuits in MPC an efficient framework for private function evaluation. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 557–574. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38348-9_33
Chapter MATH Google Scholar
Nevo, O., Trieu, N., Yanai, A.: Simple, fast malicious multiparty private set intersection. In: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, pp. 1151–1165 (2021)
Google Scholar
Pinkas, B., Schneider, T., Zohner, M.: Scalable private set intersection based on OT extension. ACM Trans. Priv. Secur. (TOPS) 21(2), 1–35 (2018)
Article Google Scholar
Trieu, N., Yanai, A., Gao, J.: Multiparty private set intersection cardinality and its applications. Cryptology ePrint Archive (2022)
Google Scholar

Download references

Acknowledgement

We are very grateful to the reviewers for their valuable comments. This work was supported in part by the National Key Research and Development Project 2020YFA0712300.

Author information

Authors and Affiliations

School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
You Chen, Ning Ding & Dawu Gu
Fudata Technology, Room 1003, T7, Lane 100, Pingjiaqiao Road, Shanghai, 200126, China
Yang Bian

Authors

You Chen
View author publications
You can also search for this author in PubMed Google Scholar
Ning Ding
View author publications
You can also search for this author in PubMed Google Scholar
Dawu Gu
View author publications
You can also search for this author in PubMed Google Scholar
Yang Bian
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Ning Ding or Dawu Gu .

Editor information

Editors and Affiliations

Institute of Information Engineering, CAS, Beijing, China
Yi Deng
Columbia University, New York, NY, USA
Moti Yung

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Chen, Y., Ding, N., Gu, D., Bian, Y. (2023). Practical Multi-party Private Set Intersection Cardinality and Intersection-Sum Under Arbitrary Collusion. In: Deng, Y., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2022. Lecture Notes in Computer Science, vol 13837. Springer, Cham. https://doi.org/10.1007/978-3-031-26553-2_9

Download citation

DOI: https://doi.org/10.1007/978-3-031-26553-2_9
Published: 19 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26552-5
Online ISBN: 978-3-031-26553-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Practical Multi-party Private Set Intersection Cardinality and Intersection-Sum Under Arbitrary Collusion

Abstract

Similar content being viewed by others

Scalable Multi-party Private Set Union from Multi-query Secret-Shared Private Membership Test

A Scalable Multiparty Private Set Intersection

Efficient and Collusion Resistant Multi-party Private Set Intersection Protocols for Large Participants and Small Sets Setting

Keywords

1 Introduction

1.1 State of the Art of MPSI-CA

1.2 State of the Art of Two-Party PSI-CA-Sum

1.3 Our Contributions

1.4 High-Level Description

1.5 Organizations

2 Preliminaries

Definition 1

Definition 2

3 Two New Primitives and Constructions

3.1 Multi-party Secret-Shared Shuffle

Theorem 1

Proof

3.2 Oblivious Zero-Sum Check

Theorem 2

Proof

4 MPSI-CA Protocol Under Arbitrary Collusion

4.1 Element Sharing

4.2 Detailed Description

Theorem 3

Proof

5 MPSI-CA-Sum Protocol Under Arbitrary Collusion

5.1 Payload Sharing

5.2 Detailed Description

Theorem 4

Proof

6 Experimental Evaluation

Change history

19 February 2023

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding authors

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation