1 Introduction

With the rapid development of computing devices and transmission media, distributed collaborative computation has become increasingly popular: independent individuals or organizations collaborate to perform computations on the union of the data they hold, and thereby obtain a more comprehensive result.

Nevertheless, this collaborative computation paradigm also introduces several challenges, especially concerning data security and privacy. For example, suppose a company would like to evaluate the prospects of a project. To obtain an accurate result, the company might need data held by other institutions. However, those institutions may not want to disclose their data, because it might contain valuable business information and sensitive personal information whose disclosure could cause serious losses or even violate relevant laws and regulations [1, 2]. To address this dilemma, various privacy-preserving approaches have been put forward. Since Lindell and Pinkas [3] introduced it to privacy-preserving collaborative data mining, secure multi-party computation (SMC) [4, 5] has proved to be a useful instrument for preserving data privacy in collaborative computation. SMC enables two or more participants to carry out a collaborative computation on their datasets without revealing any participant’s data to anybody else, including the other participants. That is, SMC supports collaborative computation and privacy preservation concurrently.

For two n-dimensional vectors \(\varvec{x}=(x_1, x_2, \cdots , x_n)\) and \(\varvec{y}=(y_1, y_2, \cdots , y_n)\), their scalar product is the sum \(\sum _{i=1}^nx_iy_i\), also called the dot product. Scalar product computation is a common step in many applications, such as computing Euclidean distance [6], item similarity [7], or trust value [8]. When the vectors \(\varvec{x}\) and \(\varvec{y}\) are held by two different parties, computing their scalar product without violating the privacy of either party’s data is a challenging problem. As one of the most significant SMC protocols, the scalar product protocol (SPP) addresses this problem, i.e., it computes the dot product while keeping each input vector private to its owner throughout the computation. SPP has been widely used in various privacy-preserving collaborative computations [6, 9–13]. Because it plays such a fundamental role, an efficient SPP can advance the practical deployment of privacy-preserving distributed collaborative computation.

Up to now, many schemes [14–20] have been proposed for privacy-preserving scalar product computation. Du and Zhan presented two schemes in [14]: a dot product protocol employing a commodity server and a scalar product protocol using a random invertible matrix. However, the former requires a third party, i.e., the commodity server, and fully discloses private data once the third party colludes with any participant. The latter does not need a third party, but takes \(O(n^2)\) computation time, which is not suitable for large-scale computation. Another scalar product protocol, based on algebraic transformation, was introduced in [15]; it also needs \(O(n^2)\) time. In [17, 18], Zhu et al. discussed the relation between the secure scalar product protocol and the privacy-preserving add-to-multiply protocol, but did not provide an efficient solution for them. Based on additively homomorphic encryption, three solutions for securely computing the dot product of private vectors are given in [16, 19, 21], respectively. As is well known, homomorphic encryption is quite expensive in real-world applications. A secret sharing-based scalar product protocol was presented by Shaneck and Kim [20]. Unfortunately, this solution also employs a third party, and when the third party colludes with a participant, the private data of the other participant is revealed. More recently, Zhu et al. [22, 23] proposed an efficient approach to securely compute the scalar product when the dimension n is even. This state-of-the-art scheme achieves O(n) complexity without requiring any third party or public key encryption system.

In this paper, we investigate the fundamental and widely used SPP. We observe that both the computation cost and the communication overhead of Zhu et al.’s scheme [22, 23] can be dramatically reduced without sacrificing security, and we propose a new solution to SPP. Compared with the state-of-the-art SPP scheme in [22, 23], which is also the fastest existing one, our proposed scheme requires less than half the cost in both computation and communication while keeping the same security. Our main contributions are as follows.

  • We present a new approach to perform scalar product computation on the private vectors of two independent participants while concurrently preserving the data privacy of each party. Compared to the state-of-the-art scheme in [22, 23], we reduce the computation and communication cost by more than \(50\,\%\) while achieving the same security.

  • Compared with computing the scalar product without any privacy preservation, our scheme incurs no extra communication overhead and little extra computation cost. That is, it attains almost optimal efficiency.

  • Through theoretical analysis and evaluation, we demonstrate the security, correctness, and efficiency of our proposed scheme.

The rest of the paper is organized as follows. Section 2 introduces our system model and discusses the state-of-the-art scheme. Section 3 proposes our new scheme. Section 4 evaluates our proposed scheme through theoretical analysis and simulation experiments. Finally, Sect. 5 concludes the paper.

2 System Model and Preliminaries

2.1 System Model

We consider a distributed collaborative computation model consisting of two participants, Alice and Bob. Alice privately holds a vector \(\varvec{x}=(x_1, x_2, \cdots , x_{n}) \) and Bob privately holds a vector \(\varvec{y}=(y_1, y_2, \cdots , y_{n})\). In this paper, we assume n to be an even integer, i.e., we focus on the even-dimension SPP, and write \(n=2k\) where k is a positive integer. It should be pointed out that any even-dimension SPP can be transformed into a general SPP through the hybrid method in [23].

The objective of SPP is for Alice to obtain a private number u and Bob to obtain a private number v, while neither private vector is disclosed to the other participant or to anybody else. In addition, the outputs u and v must satisfy Eq. (1).

$$\begin{aligned} \varvec{x}\cdot \varvec{y}=u+v \end{aligned}$$
(1)

That is, \(u+v=\sum _{i=1}^{n}x_{i}y_{i}\).
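The output condition in Eq. (1) can be stated as a small check. The following is a minimal sketch, assuming integer vectors; the function names `scalar_product` and `satisfies_eq1` and the sample values are hypothetical and only serve as illustration.

```python
from typing import Sequence


def scalar_product(x: Sequence[int], y: Sequence[int]) -> int:
    """Plain (non-private) scalar product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xi * yi for xi, yi in zip(x, y))


def satisfies_eq1(x: Sequence[int], y: Sequence[int], u: int, v: int) -> bool:
    """Check Eq. (1): the two private outputs u and v add up to x . y."""
    return u + v == scalar_product(x, y)


# Illustrative values only: x . y = 3*2 + 5*7 = 41 = (-15) + 56.
print(satisfies_eq1((3, 5), (2, 7), -15, 56))
```

Neither u nor v alone equals the scalar product; only their sum does, which is why each party can keep its own output private.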

2.2 Threat Model

Generally speaking, SMC makes one of two assumptions about participant behavior: the semi-honest model or the malicious model. A semi-honest participant is also called honest-but-curious. Under the semi-honest model, each participant is assumed to follow the steps of the SMC protocol correctly, but may keep a record of everything it legitimately receives in order to infer as much of the other participants’ confidential information as possible. In contrast, a malicious participant might do anything in the collaborative computation. The work [4] has proved that any SMC protocol in the semi-honest model can be transformed into a secure computation protocol in the malicious model.

In this paper, we assume the participants to be semi-honest, i.e., they execute the protocol exactly according to the specified steps. We also suppose that the communication channels between the participants are secure and authenticated, which can be realized by conventional cryptography.

2.3 Discussion of the State-of-the-Art Scheme

Recently, Zhu et al. [22, 23] put forward an efficient SPP for even-dimension private vectors (called EDSPP), whose detailed steps are shown in Protocol 1. In Step 1 of the protocol, for each \(j=1\) to k, the participants together generate 4 random numbers, perform 12 additions (including subtractions) and 6 multiplications, and send 6 numbers. Step 2 of the protocol requires 2k additions. Therefore, EDSPP generates 4k random numbers, requires \(12k+2k=14k\) additions and 6k multiplications, and sends 6k numbers in total.

The work [22, 23] has shown that EDSPP discloses \((x_{2j-1}+x_{2j})\) to Bob and reveals \((y_{2j-1}-y_{2j})\) to Alice. Although these disclosed sums and differences might reveal partial information about the private inputs, this level of security can still be acceptable in some real-world applications, as shown in [22, 23]. In this paper, we propose a new even-dimension SPP with much higher efficiency and the same security.

Protocol 1. EDSPP [22, 23]

3 Our New Scheme

In this paper, we focus on securely computing the scalar product of two private even-dimension vectors. For simplicity of presentation, we will introduce our scheme by using two 2-dimensional vectors (i.e., \(n=2\)). Our complete solution for any 2k-dimensional vectors (\(n=2k,~k>0\)) will be presented in the last part of this section.

When \(n=2\), Alice holds \(\varvec{x}=(x_1,x_2)\) and Bob has \(\varvec{y}=(y_1,y_2)\). To compute u and v that satisfy \(u+v=\varvec{x}\cdot \varvec{y}=x_1y_1+x_2y_2\), Alice and Bob can exchange one dimension with each other. Concretely, Alice sends \(x_2\) to Bob and Bob shares \(y_1\) with Alice; then Alice can set \(u=x_1y_1\) and Bob can obtain \(v=x_2y_2\). However, this simple interchange violates the privacy of \(x_2\) and \(y_1\). Our secure scheme is obtained by improving this simple approach.
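The simple interchange for \(n=2\) can be sketched as follows. This is an illustration only, assuming integer inputs; the function name `naive_exchange` and the dictionary keys are hypothetical. It makes explicit which raw coordinates cross the channel.

```python
from typing import Dict, Tuple


def naive_exchange(x: Tuple[int, int], y: Tuple[int, int]) -> Dict[str, int]:
    """Insecure baseline for n = 2: each party sends one raw coordinate."""
    x1, x2 = x
    y1, y2 = y
    alice_to_bob = x2          # Bob learns x2 exactly
    bob_to_alice = y1          # Alice learns y1 exactly
    u = x1 * bob_to_alice      # Alice's output: x1 * y1
    v = alice_to_bob * y2      # Bob's output:   x2 * y2
    return {"u": u, "v": v, "leaked_to_bob": alice_to_bob, "leaked_to_alice": bob_to_alice}


result = naive_exchange((3, 5), (2, 7))
print(result["u"] + result["v"])  # equals 3*2 + 5*7
```

The outputs are correct, but each message is an unmasked input coordinate, which is exactly the privacy violation the text points out.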

We first transform the problem as follows. Let X be the \(1\times 2\) matrix \((x_1,x_2)\), and Y be the \(2\times 1\) matrix \((y_1,y_2)^T\). Then, \(\varvec{x}\cdot \varvec{y}=XY\).

Further, when M is a \(2\times 2\) invertible matrix, we have \(\varvec{x}\cdot \varvec{y}=XY=(XM)(M^{-1}Y)\). Let \(X'=XM\) and \(Y'=M^{-1}Y\). If Alice shares the first dimension of \(X'\) with Bob and Bob shares the second dimension of \(Y'\) with Alice, they can obtain u and v. By selecting an appropriate matrix M, we can also preserve the privacy of both participants.

Here, we set the \(2\times 2\) matrix

$$\begin{aligned} M=\begin{bmatrix} 1&~~1\\ 1&~~0 \end{bmatrix}, \end{aligned}$$

and correspondingly

$$\begin{aligned} M^{-1}=\begin{bmatrix} 0&~~1\\ 1&~-1 \end{bmatrix}. \end{aligned}$$

Hence, \(X'=XM=(x_1+x_2, ~x_1)\) and \(Y'=M^{-1}Y=(y_2,~y_1-y_2)^T\). Let Alice share \((x_1+x_2)\) with Bob, and Bob give \((y_1-y_2)\) to Alice. After that, Alice and Bob compute \(u=x_1(y_1-y_2)\) and \(v=(x_1+x_2)y_2\), respectively. Then, we have \(u+v=(XM)(M^{-1}Y)=\varvec{x}\cdot \varvec{y}\). That is, we can complete the scalar product computation while disclosing only \((x_1+x_2)\) and \((y_1-y_2)\), which achieves the same security as the work in [22, 23]. More importantly, we require much less computation and communication cost.
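As a quick numerical check of the \(n=2\) case, take \(\varvec{x}=(3,5)\) and \(\varvec{y}=(2,7)\); these values are chosen purely for illustration and do not come from the paper. Then

$$\begin{aligned} X'&=XM=(3+5,~3)=(8,~3),\\ Y'&=M^{-1}Y=(7,~2-7)^T=(7,~-5)^T,\\ u&=x_1(y_1-y_2)=3\cdot (-5)=-15,\\ v&=(x_1+x_2)y_2=8\cdot 7=56,\\ u+v&=41=3\cdot 2+5\cdot 7=\varvec{x}\cdot \varvec{y}. \end{aligned}$$

In this example Alice sends only the value 8 to Bob, and Bob sends only the value \(-5\) to Alice.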

Fig. 1. Our Efficient Even-Dimension Scalar Product Protocol (ESPP)

Our complete scheme, called Efficient Even-Dimension Scalar Product Protocol (ESPP), is formally described in Protocol 2. To vividly show our method, we also present our scheme in Fig. 1.

Protocol 2. Efficient Even-Dimension Scalar Product Protocol (ESPP)
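The following is a minimal sketch of how the 2k-dimensional scheme might be implemented, reconstructed from the \(n=2\) description in this section and the identity used in Sect. 4.1 rather than from the protocol listing itself. It assumes integer vectors stored as Python lists and simulates both parties in one process instead of over a network; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def alice_messages(x: Sequence[int]) -> List[int]:
    """alpha_i = x_{2i-1} + x_{2i}: the k values Alice sends to Bob."""
    return [x[2 * i] + x[2 * i + 1] for i in range(len(x) // 2)]


def bob_messages(y: Sequence[int]) -> List[int]:
    """beta_i = y_{2i-1} - y_{2i}: the k values Bob sends to Alice."""
    return [y[2 * i] - y[2 * i + 1] for i in range(len(y) // 2)]


def alice_output(x: Sequence[int], beta: Sequence[int]) -> int:
    """u = sum over i of x_{2i-1} * beta_i, computed locally by Alice."""
    return sum(x[2 * i] * b for i, b in enumerate(beta))


def bob_output(y: Sequence[int], alpha: Sequence[int]) -> int:
    """v = sum over i of alpha_i * y_{2i}, computed locally by Bob."""
    return sum(a * y[2 * i + 1] for i, a in enumerate(alpha))


def espp(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int]:
    """Simulate both parties; returns (u, v) with u + v = x . y."""
    if len(x) != len(y) or len(x) == 0 or len(x) % 2 != 0:
        raise ValueError("vectors must share the same positive even dimension")
    alpha = alice_messages(x)   # Alice -> Bob
    beta = bob_messages(y)      # Bob -> Alice
    return alice_output(x, beta), bob_output(y, alpha)


u, v = espp([3, 5, 1, 4], [2, 7, 6, -2])
print(u, v, u + v)
```

Python lists are 0-indexed, so the paper's \(x_{2i-1}\) and \(x_{2i}\) correspond to `x[2 * i]` and `x[2 * i + 1]` for `i` starting at 0.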

4 Evaluation

4.1 Correctness

We consider the correctness of our scheme as follows.

For each \(i=1\) to k in Protocol 2, with \(\alpha _i=x_{2i-1}+x_{2i}\) sent by Alice and \(\beta _i=y_{2i-1}-y_{2i}\) sent by Bob, we always have

$$\begin{aligned} x_{2i-1}\beta _i+\alpha _iy_{2i}&=x_{2i-1}(y_{2i-1}-y_{2i})+(x_{2i-1}+x_{2i})y_{2i}\\ &=x_{2i-1}y_{2i-1}+x_{2i}y_{2i}. \end{aligned}$$

Thus, \(u+v=\sum _{i=1}^{k}(x_{2i-1}y_{2i-1}+x_{2i}y_{2i})\) in our Protocol 2.

That is,

$$\begin{aligned} u+v=\sum _{j=1}^{2k}x_{j}y_{j}=\varvec{x} \cdot \varvec{y}, \end{aligned}$$

which completes our proof.

4.2 Security

It is easy to see that our scheme discloses nothing but \((x_{2i-1}+x_{2i})\) and \((y_{2i-1}-y_{2i})\). Thus, our scheme achieves the same security as the existing work in [22, 23]. That security has been analyzed in detail in [22, 23], and therefore we do not repeat the analysis here.
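The disclosure can be illustrated, though not proved, with a small sketch: two different inputs for Alice that share the same pairwise sums produce identical messages, so Bob's received values and his own output are the same in both runs. The helper names and sample vectors below are hypothetical, and the formal analysis remains that of [22, 23].

```python
from typing import List, Sequence


def alice_messages(x: Sequence[int]) -> List[int]:
    """Values Alice discloses: one sum x_{2i-1} + x_{2i} per coordinate pair."""
    return [x[2 * i] + x[2 * i + 1] for i in range(len(x) // 2)]


def bob_output(y: Sequence[int], alpha: Sequence[int]) -> int:
    """Bob's local output v, which depends only on his y and the received sums."""
    return sum(a * y[2 * i + 1] for i, a in enumerate(alpha))


# Two different private inputs for Alice with equal pairwise sums (8 and 5).
x_first = [3, 5, 1, 4]
x_second = [10, -2, 0, 5]
y = [2, 7, 6, -2]

same_messages = alice_messages(x_first) == alice_messages(x_second)
same_output = bob_output(y, alice_messages(x_first)) == bob_output(y, alice_messages(x_second))
print(same_messages, same_output)
```

The sums still constrain each coordinate pair, which is the partial information leakage acknowledged in Sect. 2.3.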

4.3 Efficiency

For each \(i=1\) to k, our protocol requires 4 additions (including subtractions) and 2 multiplications, and sends 2 numbers. Thus, it needs 4k additions and 2k multiplications, and sends 2k numbers in total.

Assume the bit length of each number is \(\mathcal {B}\). In Table 1, we compare the cost of EDSPP in [22, 23], our scheme, and the scalar product computation without privacy preservation. It shows that our scheme is much more efficient than EDSPP [22, 23] in both computation cost and communication overhead. Compared with scalar product computation without privacy preservation, our scheme introduces no extra cost apart from a few additions. Therefore, our scheme achieves almost optimal efficiency for privacy-preserving distributed collaborative scalar product computation.
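The operation counts stated in Sects. 2.3 and 4.3 can be tabulated with a short sketch. The counts for EDSPP and ESPP are taken from the text; the zero random numbers for ESPP is an inference from the fact that the described steps generate none. The class and function names are hypothetical, and the non-private baseline is left out because its exact counts are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    random_numbers: int
    additions: int
    multiplications: int
    numbers_sent: int

    def bits_sent(self, bit_length: int) -> int:
        """Communication in bits when every number takes bit_length bits."""
        return self.numbers_sent * bit_length


def edspp_cost(k: int) -> Cost:
    """EDSPP [22, 23]: 4k random numbers, 14k additions, 6k multiplications, 6k numbers sent."""
    return Cost(4 * k, 14 * k, 6 * k, 6 * k)


def espp_cost(k: int) -> Cost:
    """ESPP: 4k additions, 2k multiplications, 2k numbers sent (no random numbers assumed)."""
    return Cost(0, 4 * k, 2 * k, 2 * k)


k, bit_length = 1000, 32  # illustrative parameters, not taken from the paper
old, new = edspp_cost(k), espp_cost(k)
print(old.additions, new.additions)
print(old.bits_sent(bit_length), new.bits_sent(bit_length))
```

Under these counts, ESPP sends one third as many numbers as EDSPP and performs one third as many multiplications, consistent with the claimed reduction of more than \(50\,\%\).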

Table 1. Comparison of cost

5 Conclusion

In this paper, we proposed a new even-dimension scalar product protocol, ESPP. Our proposed scheme attains the same security as the state-of-the-art solution while dramatically reducing the computation cost and communication overhead. Additionally, compared with scalar product computation without privacy preservation, our scheme introduces no extra cost apart from a few additions. This indicates that our scheme achieves almost optimal efficiency for privacy-preserving distributed collaborative scalar product computation.

In future work, we will focus on designing a formally secure SPP with high efficiency.