1 Introduction

The area of secure multiparty computation (MPC) studies how to design protocols that allow a number of parties to jointly perform computations on private inputs in such a way that each party learns a private output, but nothing beyond that. In the last decade, efficient MPC protocols have been developed that can be used in practical applications.

In this work we focus on secret-sharing based MPC protocols, which are among the most used in practice. In secret-sharing based MPC, the target computation is represented as an arithmetic circuit consisting of sum and multiplication gates over some algebraic ring. Each party initially shares her input among the set of parties, and the protocol proceeds gate by gate, creating at every gate a sharing of the output of that gate; in this manner the parties eventually obtain shares of the output of the computation, which can then be reconstructed.

A common practice is to use the preprocessing model, where the computation is divided into two stages: a preprocessing phase, which is completely independent of the inputs and whose purpose is to distribute some correlated randomness among the parties; and an online phase, where the actual computation is performed with the help of the preprocessing data. This approach allows pushing much of the complexity of the protocol into the preprocessing phase, yielding a very efficient online phase in return.

Some secret-sharing based MPC protocols obtain security against a static adversary who actively corrupts all but one of the parties in the computation, assuming that the adversary is computationally bounded. Since in the active setting corrupted parties can arbitrarily deviate from the protocol, some mechanism is needed to detect such malicious behaviour; one possibility is the use of information-theoretic MACs to authenticate the secret-shared data, as done in protocols such as BeDOZa [3] and SPDZ [14].

In SPDZ this works as follows: the computation to be performed is given by an arithmetic circuit over a large finite field \(\mathbb {F}\). There is a global key \(\alpha \in \mathbb {F}\) which is secret-shared among the parties. Then for every value \(x\in \mathbb {F}\) in the computation, the parties obtain not only additive shares of that value, but also of the product \(\alpha \cdot x\), which acts as a MAC for x. The idea is that if a set of corrupt parties change their shares and pretend that this value is \(x+e\), for some nonzero error e, then they also need to guess the correction value \(\alpha \cdot e\) for the MAC, which amounts to guessing \(\alpha \) since \(\mathbb {F}\) is a field. In turn, this happens with probability \(1/|\mathbb {F}|\), which is small when the field is large.
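The counting argument behind this can be checked mechanically. The following sketch (our own toy example over the small prime field \(\mathbb {F}_{101}\), with illustrative values of e and of the adversary's guess; real protocols use much larger fields) verifies that for a fixed nonzero error e, a fixed guessed MAC correction is consistent with exactly one key \(\alpha \), so cheating succeeds with probability \(1/|\mathbb {F}|\):

```python
# Toy check of the SPDZ MAC argument over F_101 (illustrative parameters).
p = 101
e = 7        # nonzero error the adversary adds to the shared value
delta = 42   # adversary's guess for the required MAC correction alpha * e
# The forged MAC passes iff delta = alpha * e (mod p), which pins down alpha:
good_keys = [alpha for alpha in range(p) if (alpha * e) % p == delta]
assert len(good_keys) == 1   # exactly one key is fooled: probability 1/p
```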

The problem is that over small fields the cheating success probability \(1/|\mathbb {F}|\) is large. While one can take a large enough extension field \(\mathbb {L}\) of \(\mathbb {F}\) (e.g. if \(\mathbb {F}=\mathbb {F}_2\), then \(\mathbb {L}\) could be the field of \(2^s\) elements) and embed the whole computation into \(\mathbb {L}\), this looks wasteful as communication is blown up by a factor of s.

An alternative was proposed in MiniMAC [15]. MiniMAC uses a batch authentication idea: if we are willing to compute k instances of the same arithmetic circuit over a small field simultaneously, we can bundle these computations together and see them as a computation of an arithmetic circuit over the ring \(\mathbb {F}^k\), where the sum and multiplication operations are considered coordinatewise. Note that the same authentication technique as in SPDZ does not directly work over this ring (if \(|\mathbb {F}|\) is small): if we define the MAC of a data vector \(\varvec{\mathbf {x}}\) in \(\mathbb {F}^k\) to be \(\varvec{\mathbf {\alpha }}*\varvec{\mathbf {x}}\), where the key \(\varvec{\mathbf {\alpha }}\) is now also a vector in \(\mathbb {F}^k\) and \(*\) is the coordinatewise product, the adversary can introduce an error in a single coordinate with success probability \(1/|\mathbb {F}|\). Instead, MiniMAC first encodes every vector \(\varvec{\mathbf {x}}\) as a larger vector \(C(\varvec{\mathbf {x}})\) by means of a linear error-correcting code C with large minimum distance d, and then defines the MAC as \(\varvec{\mathbf {\alpha }}*C(\varvec{\mathbf {x}})\). Now introducing an error requires changing at least d coordinates of \(C(\varvec{\mathbf {x}})\), and the MAC can be fooled with probability only \(1/|\mathbb {F}|^d\). However, when processing multiplication gates, the minimum distance \(d^{*}\) of the so-called Schur square code \(C^{*}\) also needs to be large. These requirements on the minimum distances of the two codes affect the communication overhead of the protocol: the larger d and \(d^*\) are, the worse the ratio between the length of the messages and the length of their encodings.
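The weakness of the purely coordinatewise MAC is easy to exhibit in code. This sketch (our own toy example over \(\mathbb {F}_2\) with an illustrative vector length; it does not model MiniMAC's actual encoding) flips a single coordinate and counts how many keys \(\varvec{\mathbf {\alpha }}\) accept the forgery:

```python
# With a coordinatewise MAC alpha * x over F_2^k, an error in coordinate j
# only has to be consistent with the single key bit alpha_j.
from itertools import product

k, j = 4, 2                      # toy vector length and attacked coordinate
x = (1, 0, 1, 1)
fooled = 0
for alpha in product((0, 1), repeat=k):
    mac = tuple(a & v for a, v in zip(alpha, x))      # alpha * x coordinatewise
    x_bad = list(x)
    x_bad[j] ^= 1                # error in a single coordinate
    mac_bad = tuple(a & v for a, v in zip(alpha, x_bad))
    if mac_bad == mac:           # forgery passes iff the key bit alpha_j is 0
        fooled += 1
assert fooled == 2 ** (k - 1)    # half of all keys: success probability 1/2
```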

The same article shows how to adapt this technique to the computation of a single boolean “well-formed” circuit while retaining the efficiency advantages of the batch simultaneous computation of k circuits. The idea is that if the target boolean circuit is structured into layers of addition and multiplication gates, where each layer has a large number of gates and its inputs are outputs of previous layers, then we can organize the gates into blocks of k gates of the same type, which can be computed using the above method. We then need an additional step that directs each block of outputs of a layer into the right blocks of inputs of the next layers; this uses some additional preprocessed random sharings and some openings, which slightly increases the communication complexity of the protocol.

In this paper, we explore an alternative to the error-correcting codes approach from MiniMAC, using an idea recently introduced in the honest majority, information-theoretically secure setting [8]. The point is that we can embed the ring \(\mathbb {F}_q^k\) into some extension field of \(\mathbb {F}_q\) in such a way that the operations of the two algebraic structures, and in particular the products (the coordinatewise product in one case, the product in the extension field in the other), become “somewhat compatible”: i.e., we map \(\mathbb {F}_q^k\) into a slightly larger field \(\mathbb {F}_{q^m}\) with some dedicated linear “embedding” map \(\phi \), which satisfies that for any two vectors \(\varvec{\mathbf {x}},\varvec{\mathbf {y}}\) in \(\mathbb {F}_q^k\) the field product \(\phi (\varvec{\mathbf {x}})\cdot \phi (\varvec{\mathbf {y}})\) contains all information about \(\varvec{\mathbf {x}}*\varvec{\mathbf {y}}\); in fact, there exists a “recovery” linear map \(\psi \) such that \(\varvec{\mathbf {x}}*\varvec{\mathbf {y}} = \psi (\phi (\varvec{\mathbf {x}}) \cdot \phi (\varvec{\mathbf {y}}))\). The pair \((\phi ,\psi )\) is called a (km)-reverse multiplication friendly embedding (RMFE) and was introduced in [5, 8]. With this tool, [8] embeds k evaluations of a circuit over \(\mathbb {F}_{q}\) (i.e. an evaluation of an arithmetic circuit over \(\mathbb {F}_q^k\) with coordinatewise operations) into one evaluation of a related circuit over \(\mathbb {F}_{q^m}\), which is securely computed via an information-theoretically secure MPC protocol for arithmetic circuits over that larger field (more precisely, the Beerliova-Hirt protocol [2]). The use of that MPC protocol over \(\mathbb {F}_{q^m}\) is not black-box, however: a number of modifications need to be made at multiplication and input gates, for which certain additional correlated information has to be created in the preprocessing phase.
Note that the reason for introducing this technique was that Beerliova-Hirt uses Shamir secret sharing schemes and hyperinvertible matrices, two tools that are only available over large finite fields (larger than the number of parties in the protocol).

1.1 Our Contributions

In this paper we construct a new secure computation protocol in the dishonest majority setting that allows computing several instances of a boolean circuit at an amortized cost. We do this by combining the embedding techniques from [8] with the SPDZ methodology. As opposed to [8], where one of the points of the embedding was precisely to use Shamir secret sharing, in our construction vectors \(\mathbf {x}\in \mathbb {F}_{2}^k\) are still additively shared in \(\mathbb {F}_2^k\), and it is only the MACs that are constructed and shared in the field \(\mathbb {F}_{2^m}\): the MAC of \(\mathbf {x}\) will be \(\alpha \cdot \phi (\mathbf {x})\), where \(\phi \) is the embedding map from the RMFE. Authenticated sharings where the data themselves are shared as elements of \(\mathbb {F}_{2^m}\) are used only temporarily, when processing multiplication gates. MACs are checked in a batched fashion at the output gate, at which point the protocol aborts if discrepancies are found.

By this method we obtain a very efficient online phase, where processing a multiplication gate requires each party to communicate around 10 bits per evaluation of the circuit, for statistical security parameters such as \(s=64, 128\) (meaning the adversary can successfully cheat with probability at most \(2^{-s}\); in our protocols this requires setting \(m\ge s\)).

Our protocol can also be adapted to evaluate a single instance of a boolean circuit by quite directly adapting the ideas in MiniMAC that we mentioned above, based on organizing the circuit in layers, partitioning the layers in blocks of gates, and adding some preprocessing that allows mapping each block into the appropriate one in the next layer. The reason is that the maps used between layers of gates are \(\mathbb {F}_2\)-linear, and essentially all we need to use is the \(\mathbb {F}_2\)-linearity of the map \(\phi \) from the RMFE. The actual complexity added by this transformation depends heavily on the topology of the circuit. Under some general assumptions, one can expect to add 2 bits of communication per gate.

Our online phase follows a similar pattern to MiniMAC's: up to the output phase, we partially open a value in \(\mathbb {F}_2^k\) exactly when a partial opening of a C-encoding occurs in MiniMAC, and we open values in \(\mathbb {F}_{2^m}\) whenever MiniMAC opens \(C^*\)-encodings. At every multiplication gate, both protocols need to apply “re-encoding functions” to convert encodings back to the base authentication scheme, which requires a preprocessed pair of authenticated sharings of random correlated elements.

However, the encoding via RMFE that we use is more compact than the one in MiniMAC; the comparison boils down to comparing the “expansion factor” m/k of RMFEs with the ratio \(k^*/k\) between the dimensions of \(C^*\) and C for the best binary codes with good minimum distance of \(C^*\) [7]. We cut the communication cost of multiplication gates roughly in half with respect to MiniMAC instantiated with those binary codes. We achieve even larger savings at the output gates, since there MiniMAC needs to communicate full vectors of the same length as the code, while input and addition gates have the same cost in both protocols.

We also compare our results with a modified version of MiniMAC proposed by Damgård, Lauritsen and Toft [13], which saves communication at multiplication gates by essentially running MiniMAC over the field of 256 elements, at the cost of a much larger amount of preprocessing: it essentially provides authenticated sharings of the bit decompositions of the \(\mathbb {F}_{256}\)-coordinates of the elements in a triple, so that parties can compute bitwise operations. This version achieves a communication complexity that is around \(80\%\) of that of our protocol, due to the fact that it can make use of Reed-Solomon codes. However, it requires creating authenticated sharings of 19 elements, while ours needs 5, and as far as we know no explicit preprocessing protocol has been proposed for this version of MiniMAC.

Finally, we compare our results with Committed MPC [16], a secret-sharing based protocol which uses (UC-secure) homomorphic commitments for authentication, rather than information-theoretic MACs. In particular, this protocol can also be used for boolean circuits, given that efficient constructions of homomorphic commitments [9, 10, 17] over \(\mathbb {F}_2\) have been proposed. These constructions of homomorphic commitments also use error-correcting codes. We find that, again, the smaller expansion m/k of RMFEs compared to the relations between the parameters of binary error-correcting codes provides an improvement in the communication complexity by a factor of \(\sim 3\) for security parameters \(s=64,128\).

We also provide a preprocessing phase producing all the authenticated sharings of random correlated data that we need. The preprocessing follows the steps of MASCOT [19] (see also [18]) based on OT extension, with some modifications due to our slightly different authentication mechanism and the different format of our preprocessing. All these modifications are easy to carry out because \(\phi \) and \(\psi \) are linear maps over \(\mathbb {F}_2\). Nevertheless, the “triple sacrificing” steps from MASCOT, which ensure that preprocessed triples are not malformed, present problems in our case for technical reasons. Instead, we use the techniques from Committed MPC [16] for that part of the triple generation.

1.2 Related Work

The use of information-theoretic MACs in secret-sharing based multiparty computation dates back to BeDOZa (Bendlin et al. [3]), where such MACs were established between every pair of players. Later, SPDZ (Damgård et al. [14]) introduced the strategy of a global MAC for every element, of which every party has a share, and whose key is likewise shared among the parties. TinyOT (Nielsen et al. [21]), a 2-party protocol for binary circuits, introduced the idea of using OT extension in the preprocessing phase. Larraia et al. [20] extended these ideas to a multi-party protocol by using the SPDZ global shared MAC approach. MiniMAC (Damgård and Zakarias [15]), as explained above, used error-correcting codes in order to authenticate vectors of bits, allowing for efficient parallel computation of several evaluations of the same binary circuit on possibly different inputs. Damgård et al. [13] proposed several improvements for the implementation of MiniMAC, among them the use of an error-correcting code over an extension field, trading smaller communication complexity for a larger amount of preprocessing. Frederiksen et al. [18] gave new protocols for the construction of preprocessed multiplication triples in fields of characteristic two, based on OT extension, and in particular provided the first preprocessing phase for MiniMAC. MASCOT (Keller et al. [19]) built on some of these ideas to create preprocessing protocols for SPDZ based on OT extension. Committed MPC (Frederiksen et al. [16]) is a secret-sharing based secure computation protocol that relies on UC-secure homomorphic commitments instead of homomorphic MACs for authentication, but otherwise follows a similar pattern to the protocols above. Efficient constructions of UC-secure homomorphic commitments from OT have been proposed by Frederiksen et al. [17] and Cascudo et al. [10], based on error-correcting codes.
Later, a modified construction from extractable commitments, still using error-correcting codes, was proposed in [9]; it presents an important advantage for use in Committed MPC, namely that the commitment schemes are multi-verifier.

The notion of reverse multiplication friendly embedding was first explicitly defined and studied in the context of secure computation by Cascudo et al. in [8] and independently by Block et al. in [5]. The former work is in the context of information-theoretically secure protocols, while the latter studies 2-party protocols over small fields where the assumed resource is OLE over an extension field. The latter is partially based on a previous work, also by Block et al. [4], where (asymptotically less efficient) constructions of RMFEs were implicitly used.

2 Preliminaries

Let \(\mathbb {F}_q\) denote the finite field with q elements. Vectors are denoted with bold letters, as \(\varvec{\mathbf {x}}=(x_1,x_2,\ldots ,x_n)\), and componentwise products of two vectors are denoted by \(\varvec{\mathbf {x}}*\varvec{\mathbf {y}}=(x_1\cdot y_1,x_2\cdot y_2,\ldots ,x_n\cdot y_n)\). Fixing an irreducible polynomial f of degree m in \(\mathbb {F}_q[X]\), elements of the field \(\mathbb {F}_{q^m}\) with \(q^m\) elements can be represented as polynomials in \(\mathbb {F}_{q}[X]\) of degree at most \(m-1\), i.e. \(\alpha =\alpha _0+\alpha _1\cdot X+\cdots +\alpha _{m-1}\cdot X^{m-1}\in \mathbb {F}_{q^m}\), where \(\alpha _i\in \mathbb {F}_{q}\). Sums and products of elements are defined modulo f.

In our protocols we will assume a network of n parties who communicate by secure point-to-point channels, and a static adversary who can actively corrupt up to \(n-1\) of these parties. Our proofs will be in the universal composability model [6].

We recall the notion of reverse multiplication friendly embeddings from [8].

Definition 1

Let \(k,m \in \mathbb {Z}^+\). A pair of \(\mathbb {F}_q\)-linear maps \((\phi , \psi )\), where \(\phi :\mathbb {F}_q^k \rightarrow \mathbb {F}_{q^m}\) and \(\psi :\mathbb {F}_{q^m} \rightarrow \mathbb {F}_q^k\) is called a \((k,m)_q\)-reverse multiplication friendly embedding (RMFE) if for all \(\varvec{\mathbf {x}}, \varvec{\mathbf {y}} \in \mathbb {F}_q^k\)

$$ \varvec{\mathbf {x}}*\varvec{\mathbf {y}} = \psi (\phi (\varvec{\mathbf {x}}) \cdot \phi (\varvec{\mathbf {y}})) $$

In other words, this tool allows multiplying two vectors over \(\mathbb {F}_q\) coordinatewise by first embedding them into a larger field with \(\phi \), multiplying the resulting images, and mapping the result back to a vector over \(\mathbb {F}_q\) with the other map \(\psi \).
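As a concrete toy instance (our own illustrative example, much smaller than the constructions of the theorems below), the following sketch builds a \((2,3)_2\)-RMFE by interpolation: \(\phi \) maps \((x_1,x_2)\) to the polynomial p of degree at most 1 with \(p(0)=x_1\) and \(p(1)=x_2\), and \(\psi \) evaluates a field element, viewed as a polynomial of degree at most 2, back at 0 and 1. Since the product of two degree-\(\le 1\) polynomials has degree \(\le 2 < 3\), no modular reduction interferes, and the evaluations of the product at 0 and 1 are exactly \(x_1 y_1\) and \(x_2 y_2\):

```python
# A toy (2,3)_2-RMFE built by polynomial interpolation (illustrative example).
# Elements of F_{2^3} are ints whose bit i is the coefficient of X^i,
# with arithmetic modulo the irreducible polynomial X^3 + X + 1.
MOD, M = 0b1011, 3

def gf_mul(a, b):
    """Multiply in F_{2^3}: carry-less product, then reduce modulo X^3 + X + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def phi(x):
    """Embed (x1, x2) as the degree-<=1 polynomial p with p(0)=x1, p(1)=x2."""
    return x[0] | ((x[0] ^ x[1]) << 1)

def psi(z):
    """Recover a vector by evaluating z (a degree-<=2 polynomial) at 0 and 1."""
    return (z & 1, (z ^ (z >> 1) ^ (z >> 2)) & 1)

# Exhaustive check of the RMFE identity x * y = psi(phi(x) . phi(y)):
vecs = [(a, b) for a in (0, 1) for b in (0, 1)]
for x in vecs:
    for y in vecs:
        assert psi(gf_mul(phi(x), phi(y))) == (x[0] & y[0], x[1] & y[1])
```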

Several results about the existence of such pairs can be found in [8], both in the asymptotic and concrete settings. For our results we will only need the following construction, which can be obtained via simple interpolation techniques:

Theorem 1

([8]). For all \(r \le 33\), there exists a \((3r, 10r-5)_2\) -RMFE.

However, we remark that for implementations it might be more useful to consider the following constructions of RMFEs, which can also be deduced from the general framework in [8] (also based on polynomial interpolation). They have a worse rate k/m than those in Theorem 1, but the advantage that their image can lie in a field whose extension degree is a power of two, e.g. \(\mathbb {F}_{q^m}=\mathbb {F}_{2^{64}}\) or \(\mathbb {F}_{2^{128}}\).

Theorem 2

For any \(r\le 16\), there exists a \((2r,8r)_2\)-RMFE.

For our numerical comparisons we will mainly consider the constructions with better rate from Theorem 1, and point out that, should one want to use Theorem 2 instead, a small overhead in communication is introduced.

It is important to understand some properties and limitations of RMFEs. Because \(\phi \) and \(\psi \) are \(\mathbb {F}_{q}\)-linear,

$$\begin{aligned} \phi (\varvec{\mathbf {x}}+\varvec{\mathbf {y}})=\phi (\varvec{\mathbf {x}})+\phi (\varvec{\mathbf {y}}), \ \ \psi (x+y)=\psi (x)+\psi (y) \end{aligned}$$

holds for all \(\varvec{\mathbf {x}},\varvec{\mathbf {y}}\in \mathbb {F}_q^k\) and \(x,y\in \mathbb {F}_{q^m}\). However, for example

$$ \phi (\varvec{\mathbf {x}}*\varvec{\mathbf {y}})\ne \phi (\varvec{\mathbf {x}})\cdot \phi (\varvec{\mathbf {y}}) $$

in general. Likewise we will need to take into account that the composition \(\phi \circ \psi :\mathbb {F}_{q^m}\rightarrow \mathbb {F}_{q^m}\) is a linear map over \(\mathbb {F}_{q}\) but not over \(\mathbb {F}_{q^m}\). Therefore

$$\begin{aligned} (\phi \circ \psi )(x+y)=(\phi \circ \psi )(x)&+(\phi \circ \psi )(y) \ \ \text {for all} \,x,y\in \mathbb {F}_{q^m}\,,\text { but}\\ (\phi \circ \psi )(\alpha \cdot x)&\ne \alpha \cdot (\phi \circ \psi )(x) \end{aligned}$$

for \(\alpha , x \in \mathbb {F}_{q^m}\) in general (it does hold when \(\alpha \in \mathbb {F}_{q}\), but this is not too relevant).
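Both facts can be confirmed mechanically. Reusing a toy interpolation-based \((2,3)_2\)-RMFE (an illustrative construction of our own, re-derived here so the snippet is self-contained), a brute-force search verifies additivity of \(\phi \circ \psi \) over all of \(\mathbb {F}_{2^3}\) and exhibits pairs \(\alpha , x\) violating \(\mathbb {F}_{2^m}\)-linearity:

```python
# Toy F_{2^3} = F_2[X]/(X^3 + X + 1); elements are ints, bit i = coeff of X^i.
MOD, M = 0b1011, 3

def gf_mul(a, b):
    r = 0
    while b:                      # carry-less multiplication
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):   # reduce modulo X^3 + X + 1
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def phi(x):                       # toy (2,3)_2-RMFE: p(0) = x1, p(1) = x2
    return x[0] | ((x[0] ^ x[1]) << 1)

def psi(z):                       # evaluate at 0 and 1
    return (z & 1, (z ^ (z >> 1) ^ (z >> 2)) & 1)

comp = lambda z: phi(psi(z))      # the composition phi o psi on F_{2^3}

# Additivity (F_2-linearity) holds for every pair...
assert all(comp(x ^ y) == comp(x) ^ comp(y) for x in range(8) for y in range(8))
# ...but F_{2^3}-linearity fails: some alpha, x have comp(alpha.x) != alpha.comp(x).
bad = [(a, x) for a in range(8) for x in range(8)
       if comp(gf_mul(a, x)) != gf_mul(a, comp(x))]
assert bad                        # counterexamples exist, e.g. alpha = X, x = X^2
```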

These limitations on the algebra of \(\phi \) and \(\psi \) posed certain obstacles in the information-theoretic setting [8], since processing multiplication gates required computing gates given by the map \(\phi \circ \psi \), which cannot be treated as a simple linear gate over \(\mathbb {F}_{q^m}\). The additivity of \(\phi \circ \psi \), combined with certain involved preprocessing techniques, saved the day there. For completeness (and comparison with our paper) we summarize the main details of [8] in the full version of this paper [11]. In our case we will again encounter problems caused by these limitations, as we explain next, but we solve them in a different way.

3 The Online Phase

In this section we present our protocol for computing k instances of a boolean circuit in parallel, which we can see as computing one instance of an arithmetic circuit over the ring \(\mathbb {F}_2^k\) of length-k boolean vectors with coordinatewise sum and product.

Our strategy is to use mixed authenticated sharings: the inputs and all other values \(\varvec{\mathbf {x}}\) in the computation are additively shared as vectors over \(\mathbb {F}_2^k\) (we refer to these as data shares), but their MACs are elements \(\alpha \cdot \phi (\varvec{\mathbf {x}})\) of the larger field \(\mathbb {F}_{2^m}\), where \(\alpha \in \mathbb {F}_{2^m}\) is (as in SPDZ) a global key that is additively shared among the parties from the beginning (with \(\alpha ^{(i)}\) denoting the share of party \(P_i\)), and the parties hold additive shares of \(\alpha \cdot \phi (\varvec{\mathbf {x}})\), also in the field \(\mathbb {F}_{2^m}\) (the MAC shares). We will denote the authentication of \(\varvec{\mathbf {x}}\) by \(\langle \varvec{\mathbf {x}} \rangle \). That is,

$$ \langle \varvec{\mathbf {x}}\rangle =\left( (\varvec{\mathbf {x}}^{(1)},\varvec{\mathbf {x}}^{(2)},\ldots ,\varvec{\mathbf {x}}^{(n)}),(m^{(1)}(\varvec{\mathbf {x}}),m^{(2)}(\varvec{\mathbf {x}}),\ldots ,m^{(n)}(\varvec{\mathbf {x}}))\right) $$

where each party \(P_i\) holds an additive share \(\varvec{\mathbf {x}}^{(i)}\in \mathbb {F}_2^k\) and a MAC share \(m^{(i)}(\varvec{\mathbf {x}})\in \mathbb {F}_{2^m}\), such that \(\sum _{i=1}^n m^{(i)}(\varvec{\mathbf {x}})=\alpha \cdot \sum _{i=1}^n \phi (\varvec{\mathbf {x}}^{(i)})=\alpha \cdot \phi (\varvec{\mathbf {x}}).\)
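A minimal sketch of such a mixed sharing, with toy parameters \(k=2\), \(m=3\), three parties, and an interpolation-based \((2,3)_2\)-RMFE (all illustrative choices; the actual protocol uses much larger m), checks the invariant that the data shares reconstruct \(\varvec{\mathbf {x}}\) while the MAC shares reconstruct \(\alpha \cdot \phi (\varvec{\mathbf {x}})\):

```python
# Toy mixed sharing <x>: XOR data shares in F_2^2, XOR MAC shares in F_{2^3}.
import random

MOD, M = 0b1011, 3                      # F_{2^3} = F_2[X]/(X^3 + X + 1)

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def phi(x):                             # RMFE embedding: p(0) = x1, p(1) = x2
    return x[0] | ((x[0] ^ x[1]) << 1)

n = 3                                   # number of parties (toy)
alpha = random.randrange(8)             # global key (itself shared in the protocol)
x = (1, 0)

# Data shares: a random XOR-sharing of x over F_2^2.
data = [(random.randrange(2), random.randrange(2)) for _ in range(n - 1)]
data.append(tuple(x[t] ^ data[0][t] ^ data[1][t] for t in range(2)))

# MAC shares: a random XOR-sharing of alpha . phi(x) over F_{2^3}.
mac = [random.randrange(8) for _ in range(n - 1)]
mac.append(gf_mul(alpha, phi(x)) ^ mac[0] ^ mac[1])

# The invariant of <x>:
assert tuple(s0 ^ s1 ^ s2 for s0, s1, s2 in zip(*data)) == x
assert mac[0] ^ mac[1] ^ mac[2] == gf_mul(alpha, phi(x))
```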

The additivity of \(\phi \) guarantees that additions can still be computed locally, and we can define \(\langle \varvec{\mathbf {x}}\rangle +\langle \varvec{\mathbf {y}}\rangle =\langle \varvec{\mathbf {x}}+\varvec{\mathbf {y}}\rangle \) where every party just adds up their shares for both values. Moreover, given a public vector \(\varvec{\mathbf {a}}\) and \(\langle \varvec{\mathbf {x}}\rangle \), parties can also locally compute an authenticated sharing of \(\varvec{\mathbf {a}}+\varvec{\mathbf {x}}\) as

$$\begin{aligned} \varvec{\mathbf {a}}+\langle \varvec{\mathbf {x}}\rangle =\Big ((\varvec{\mathbf {x}}^{(1)}+\varvec{\mathbf {a}},\varvec{\mathbf {x}}^{(2)},\ldots ,\varvec{\mathbf {x}}^{(n)}),(\alpha ^{(1)}\cdot \phi (\varvec{\mathbf {a}})+ m^{(1)}(\varvec{\mathbf {x}}),\ldots ,\alpha ^{(n)}\cdot \phi (\varvec{\mathbf {a}})+ m^{(n)}(\varvec{\mathbf {x}}))\Big ) \end{aligned}$$

This allows us to easily process addition with constants. Moreover, it also allows us to explain how inputs are shared in the first place. In the preprocessing phase, the parties have created for each input gate an authenticated random value \(\langle \varvec{\mathbf {r}}\rangle \), where \(\varvec{\mathbf {r}}\) is known to the party that will provide the input \(\varvec{\mathbf {x}}\) at that gate. This party can just broadcast the difference \(\varvec{\mathbf {\epsilon }}=\varvec{\mathbf {x}}-\varvec{\mathbf {r}}\), and then the parties simply compute \(\varvec{\mathbf {\epsilon }}+\langle \varvec{\mathbf {r}}\rangle =\langle \varvec{\mathbf {x}}\rangle \) by the rule above.
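The input step can be traced with the same toy parameters (\(k=2\), \(m=3\), three parties, an interpolation-based \((2,3)_2\)-RMFE; all sizes illustrative): parties hold \(\langle \varvec{\mathbf {r}}\rangle \), the input owner broadcasts \(\varvec{\mathbf {\epsilon }}\), party 1 adjusts its data share, and every party \(P_i\) adds \(\alpha ^{(i)}\cdot \phi (\varvec{\mathbf {\epsilon }})\) to its MAC share. The final assertion holds precisely because \(\phi \) is \(\mathbb {F}_2\)-linear:

```python
# Toy input step: mask with a preprocessed random <r>, broadcast eps = x - r.
import random

MOD, M = 0b1011, 3

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def phi(x):
    return x[0] | ((x[0] ^ x[1]) << 1)

n = 3
alpha_sh = [random.randrange(8) for _ in range(n)]      # shares of the global key
alpha = alpha_sh[0] ^ alpha_sh[1] ^ alpha_sh[2]

# Preprocessed <r>: data shares of a random r, MAC shares of alpha . phi(r).
r = (random.randrange(2), random.randrange(2))
r_data = [(random.randrange(2), random.randrange(2)) for _ in range(n - 1)]
r_data.append(tuple(r[t] ^ r_data[0][t] ^ r_data[1][t] for t in range(2)))
r_mac = [random.randrange(8) for _ in range(n - 1)]
r_mac.append(gf_mul(alpha, phi(r)) ^ r_mac[0] ^ r_mac[1])

x = (1, 1)                                   # the input to be shared
eps = (x[0] ^ r[0], x[1] ^ r[1])             # broadcast by the input owner
# Constant-addition rule: party 1 adds eps to its data share, and every
# party i adds alpha^(i) . phi(eps) to its MAC share.
x_data = [tuple(e ^ s for e, s in zip(eps, r_data[0]))] + r_data[1:]
x_mac = [gf_mul(a, phi(eps)) ^ m for a, m in zip(alpha_sh, r_mac)]

assert tuple(s0 ^ s1 ^ s2 for s0, s1, s2 in zip(*x_data)) == x
# By F_2-linearity of phi, the MAC shares now reconstruct alpha . phi(x):
assert x_mac[0] ^ x_mac[1] ^ x_mac[2] == gf_mul(alpha, phi(x))
```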

As in SPDZ, the parties in our protocol do not need to open any MAC until the output gate. At the output gate, the parties check MACs on random linear combinations of all values partially opened during the protocol, ensuring that the parties have not cheated except with probability at most \(2^{-m}\) (we need \(m\ge s\), where s is the statistical security parameter); then they open the result of the computation and check that the MAC of the result is also correct.
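As a sanity check of the batched verification, the following sketch uses the same toy parameters (\(k=2\), \(m=3\), three honest parties, an interpolation-based \((2,3)_2\)-RMFE; all illustrative, and the commit-and-open of the check values from the SPDZ-style procedure [14] is omitted) to confirm that the parties' check values cancel on a random \(\mathbb {F}_{2^m}\)-linear combination of the opened values:

```python
# Toy batched MAC check: sum_i ( sum_j chi_j m^(i)(w_j) - alpha^(i) sum_j chi_j phi(w_j) ) = 0.
import random

MOD, M = 0b1011, 3

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def phi(x):
    return x[0] | ((x[0] ^ x[1]) << 1)

n, t = 3, 4                                    # parties and opened values (toy)
alpha_sh = [random.randrange(8) for _ in range(n)]
alpha = alpha_sh[0] ^ alpha_sh[1] ^ alpha_sh[2]

opened = [(random.randrange(2), random.randrange(2)) for _ in range(t)]
mac_sh = []                                    # honest MAC sharings of alpha.phi(w)
for w in opened:
    m0, m1 = random.randrange(8), random.randrange(8)
    mac_sh.append([m0, m1, gf_mul(alpha, phi(w)) ^ m0 ^ m1])

chi = [random.randrange(8) for _ in range(t)]  # public random coefficients
combo = 0                                      # chi-combination of the phi(w_j)
for c, w in zip(chi, opened):
    combo ^= gf_mul(c, phi(w))

sigma = []
for i in range(n):
    mi = 0
    for c, ms in zip(chi, mac_sh):
        mi ^= gf_mul(c, ms[i])                 # chi-combination of P_i's MAC shares
    sigma.append(mi ^ gf_mul(alpha_sh[i], combo))

assert sigma[0] ^ sigma[1] ^ sigma[2] == 0     # the check passes in an honest run
```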

A harder question, as usual, is how to process multiplication gates: given \(\langle \varvec{\mathbf {x}} \rangle \), \(\langle \varvec{\mathbf {y}} \rangle \), the parties need to compute \(\langle \varvec{\mathbf {x}}* \varvec{\mathbf {y}} \rangle \), which means obtaining not only an additive sharing of \(\varvec{\mathbf {x}}* \varvec{\mathbf {y}}\) but also of its MAC \( \alpha \cdot \phi (\varvec{\mathbf {x}}* \varvec{\mathbf {y}})\). If we try to apply the well-known technique of Beaver [1] directly, we encounter the following problem. Suppose we have obtained a random triple \(\langle \varvec{\mathbf {a}} \rangle , \langle \varvec{\mathbf {b}}\rangle , \langle \varvec{\mathbf {a}}*\varvec{\mathbf {b}}\rangle \) from the preprocessing phase and, proceeding as usual, the parties partially open the values \(\varvec{\mathbf {\epsilon }}=\varvec{\mathbf {x}}-\varvec{\mathbf {a}}\), \(\varvec{\mathbf {\delta }}=\varvec{\mathbf {y}}-\varvec{\mathbf {b}}\) (a partial opening is an opening of the data shares but not the MAC shares). From here, computing data shares for \(\varvec{\mathbf {x}}* \varvec{\mathbf {y}}\) is easy; the obstacle lies in computing shares of \( \alpha \cdot \phi (\varvec{\mathbf {x}}* \varvec{\mathbf {y}})\). Indeed

$$\begin{aligned} \alpha \cdot \phi (\varvec{\mathbf {x}}*\varvec{\mathbf {y}})=\alpha \cdot \phi (\varvec{\mathbf {a}}*\varvec{\mathbf {b}})+ \alpha \cdot \phi (\varvec{\mathbf {a}}*\varvec{\mathbf {\delta }})+ \alpha \cdot \phi (\varvec{\mathbf {\epsilon }}*\varvec{\mathbf {b}})+ \alpha \cdot \phi (\varvec{\mathbf {\epsilon }}* \varvec{\mathbf {\delta }}), \end{aligned}$$

and the two terms in the middle present a problem: for example for \(\alpha \cdot \phi (\varvec{\mathbf {a}}*\varvec{\mathbf {\delta }})\) we have by the properties of the RMFE

$$\begin{aligned} \alpha \cdot \phi (\varvec{\mathbf {a}}*\varvec{\mathbf {\delta }})=\alpha \cdot \phi (\psi (\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {\delta }})))=\alpha \cdot (\phi \circ \psi )(\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {\delta }})) \end{aligned}$$

However, \(\phi \circ \psi \) is only \(\mathbb {F}_2\)-linear, not \(\mathbb {F}_{2^m}\)-linear, so we cannot just “take \(\alpha \) inside the argument” and use the additive sharing of \(\alpha \cdot \phi (\varvec{\mathbf {a}})\) given in \(\langle \varvec{\mathbf {a}}\rangle \) to compute a sharing of the expression above. Instead, we use a two-step process to compute multiplication gates, for which we need to introduce regular SPDZ sharings of elements \(x\in \mathbb {F}_{2^m}\), i.e. both x and its MAC \(\alpha \cdot x\) are additively shared in \(\mathbb {F}_{2^m}\). We denote these by [x], that is

$$\begin{aligned}{}[x]=\left( (x^{(1)},x^{(2)},\ldots ,x^{(n)}),(m^{(1)}(x),m^{(2)}(x),\ldots ,m^{(n)}(x))\right) , \end{aligned}$$

where \(P_i\) will hold \(x^{(i)}\) and \(m^{(i)}(x)\in \mathbb {F}_{2^m}\) with \(\sum _{i=1}^n m^{(i)}(x)=\alpha \cdot \sum _{i=1}^n x^{(i)}.\)

To carry out the multiplication we need to preprocess a triple \((\langle \varvec{\mathbf {a}} \rangle , \langle \varvec{\mathbf {b}}\rangle ,\langle \varvec{\mathbf {c}}\rangle )\) where \(\varvec{\mathbf {c}}=\varvec{\mathbf {a}}*\varvec{\mathbf {b}}\), and a pair of the form \((\langle \psi (r) \rangle ,[r])\) where r is a random element in \(\mathbb {F}_{2^m}\). In the first step of the multiplication we compute and partially open

$$\begin{aligned}{}[\sigma ]=[\phi (\varvec{\mathbf {x}})\cdot \phi (\varvec{\mathbf {y}})-\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {b}})-r]. \end{aligned}$$
(1)

This can be computed from the \(\varvec{\mathbf {\epsilon }}\) and \(\varvec{\mathbf {\delta }}\) described above (details are given later). In the second step, we create \(\langle \varvec{\mathbf {x}}*\varvec{\mathbf {y}}\rangle \) from (1) using the properties of the RMFE: namely, \(\varvec{\mathbf {x}}*\varvec{\mathbf {y}}=\psi (\phi (\varvec{\mathbf {x}})\cdot \phi (\varvec{\mathbf {y}}))\) and \(\varvec{\mathbf {a}}*\varvec{\mathbf {b}}=\psi (\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {b}}))\), so applying \(\psi \) to \(\sigma \) in (1) yields \( \varvec{\mathbf {x}}* \varvec{\mathbf {y}}-\varvec{\mathbf {a}}*\varvec{\mathbf {b}}-\psi (r)\) by the additivity of \(\psi \). Adding \(\langle \varvec{\mathbf {a}}*\varvec{\mathbf {b}} \rangle + \langle \psi (r)\rangle \) (the as-yet-unused preprocessed elements) gives \(\langle \varvec{\mathbf {x}}*\varvec{\mathbf {y}}\rangle \).

We still need to explain how to construct \([\sigma ]\). For this we introduce some algebraic operations on the two types of authenticated sharings and public values. First given a public vector \(\varvec{\mathbf {a}}\) and a shared vector \(\varvec{\mathbf {x}}\) we define:

$$\begin{aligned} \varvec{\mathbf {a}}*\langle \varvec{\mathbf {x}}\rangle =\left( (\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}}^{(1)}),\ldots ,\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}}^{(n)})),(\phi (\varvec{\mathbf {a}})\cdot m^{(1)}(\varvec{\mathbf {x}}),\ldots ,\phi (\varvec{\mathbf {a}})\cdot m^{(n)}(\varvec{\mathbf {x}}))\right) \end{aligned}$$

Note that the data shares are shares of \(\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}})\), which is an element of \(\mathbb {F}_{2^m}\), and the MAC shares also correspond to additive shares of \(\alpha \cdot \phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}})\). However, the data shares are not distributed uniformly in \(\mathbb {F}_{2^m}\) because \(\phi \) is not surjective, so one cannot say this equals \([\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}})]\). Nevertheless, given another [z], with \(z\in \mathbb {F}_{2^m}\), it is true that \(\varvec{\mathbf {a}}*\langle \varvec{\mathbf {x}}\rangle +[z]=[\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {x}})+z]\) where the sum on the left is defined by just local addition of the data and MAC shares. We also define

$$\begin{aligned} \langle \varvec{\mathbf {x}}\rangle +[y]=\Big (&(\phi (\varvec{\mathbf {x}}^{(1)})+y^{(1)},\ldots ,\phi (\varvec{\mathbf {x}}^{(n)})+y^{(n)}),\\&(m^{(1)}(\varvec{\mathbf {x}})+m^{(1)}(y),\ldots ,m^{(n)}(\varvec{\mathbf {x}})+m^{(n)}(y))\Big ) =[\phi (\varvec{\mathbf {x}})+y] \end{aligned}$$

Now, given \(\langle \varvec{\mathbf {x}}\rangle , \langle \varvec{\mathbf {y}}\rangle \) and a triple \(\langle \varvec{\mathbf {a}} \rangle , \langle \varvec{\mathbf {b}}\rangle , \langle \varvec{\mathbf {a}}*\varvec{\mathbf {b}}\rangle \), parties can open \(\varvec{\mathbf {\epsilon }}=\varvec{\mathbf {x}}-\varvec{\mathbf {a}}\), \(\varvec{\mathbf {\delta }}=\varvec{\mathbf {y}}-\varvec{\mathbf {b}}\) and construct

$$\begin{aligned} \varvec{\mathbf {\epsilon }}* \langle \varvec{\mathbf {y}}\rangle +\varvec{\mathbf {\delta }}* \langle \varvec{\mathbf {x}}\rangle - \phi (\varvec{\mathbf {\epsilon }})\cdot \phi (\varvec{\mathbf {\delta }})-[r]&=[\phi (\varvec{\mathbf {\epsilon }})\cdot \phi (\varvec{\mathbf {y}})+\phi (\varvec{\mathbf {\delta }})\cdot \phi (\varvec{\mathbf {x}})- \phi (\varvec{\mathbf {\epsilon }})\cdot \phi (\varvec{\mathbf {\delta }})-r] \\ {}&=[\phi (\varvec{\mathbf {x}})\cdot \phi (\varvec{\mathbf {y}})-\phi (\varvec{\mathbf {a}})\cdot \phi (\varvec{\mathbf {b}})-r], \end{aligned}$$

where the latter equality can be seen by expanding the expressions for \(\varvec{\mathbf {\epsilon }}\) and \(\varvec{\mathbf {\delta }}\) and using the additivity of \(\phi \). The obtained sharing is the \([\sigma ]\) we needed above. Summing up, the whole multiplication gate costs two openings of sharings of vectors in \(\mathbb {F}_2^k\) and one opening of a sharing of an element in \(\mathbb {F}_{2^m}\). Every multiplication gate requires fresh preprocessed correlated authenticated sharings \((\langle \varvec{\mathbf {a}} \rangle , \langle \varvec{\mathbf {b}}\rangle ,\langle \varvec{\mathbf {a}}*\varvec{\mathbf {b}}\rangle )\) and \((\langle \psi (r)\rangle ,[r])\) for random \(\varvec{\mathbf {a}},\varvec{\mathbf {b}},r\).
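Ignoring the MACs and the embedding \(\phi \), the Beaver-style opening above can be sketched for plain XOR-sharings of k-bit blocks. The following is a simplified illustration only (all function names are ours, not part of the protocol description); over \(\mathbb {F}_2\), subtraction coincides with XOR, so the signs in the formula above become XORs:

```python
import secrets

K = 8  # block length k (illustrative); vectors in F_2^k are stored as bitmasks

def xor_all(vals):
    acc = 0
    for v in vals:
        acc ^= v
    return acc

def share(x, n):
    """Additive (XOR) sharing of a k-bit block among n parties."""
    shares = [secrets.randbits(K) for _ in range(n - 1)]
    shares.append(x ^ xor_all(shares))
    return shares

def beaver_mult(x_sh, y_sh, a_sh, b_sh, c_sh):
    """Coordinatewise product via a triple (a, b, c = a * b):
    open eps = x + a and delta = y + b, then over F_2
    x * y = eps * y + delta * x + eps * delta + c."""
    n = len(x_sh)
    eps = xor_all(xi ^ ai for xi, ai in zip(x_sh, a_sh))
    dlt = xor_all(yi ^ bi for yi, bi in zip(y_sh, b_sh))
    out = [(eps & y_sh[i]) ^ (dlt & x_sh[i]) ^ c_sh[i] for i in range(n)]
    out[0] ^= eps & dlt  # the public term is added by one designated party
    return out
```

Opening \(\varvec{\mathbf {\epsilon }}\) and \(\varvec{\mathbf {\delta }}\) accounts for the two k-bit openings per multiplication gate counted above; the additional \([\cdot ]\)-opening that re-encodes the result into a \(\langle \cdot \rangle \)-sharing is not modelled in this sketch.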

We now formally present the online protocol we just explained, the functionality it implements, and the functionalities needed from preprocessing. The functionality constructing the required preprocessed randomness is given in Fig. 2, and relies on the authentication functionality in Fig. 1. The latter augments the one in MASCOT [19] by also allowing parties to authenticate vectors and to compute linear combinations involving the two different types of authenticated values, which can be realized by means of the \([\cdot ]\)- and \(\langle \cdot \rangle \)-sharings.

Fig. 1.
figure 1

Functionality – authentication

Fig. 2.
figure 2

Functionality – preprocessing

The functionality for our MPC protocol is in Fig. 3 and the protocol implementing the online phase is in Fig. 4.

Theorem 3

\(\varPi _{\mathrm {Online}}\) securely implements \(\mathcal {F}_{\mathrm {MPC}}\) in the \(\mathcal {F}_{\mathrm {Prep}}\)-hybrid model.

Proof

The correctness follows from the explanation above. For more details we refer to the full version, but we also note that the online phase from this protocol is similar to the online phases of protocols such as [14,15,16, 19], except that in every multiplication we additionally need to use the pair \((\langle \psi (r)\rangle ,[r])\) in order to transform a \([\cdot ]\)-sharing into \(\langle \varvec{\mathbf {x}}*\varvec{\mathbf {y}}\rangle \). However, since r is uniformly random in the field \(\mathbb {F}_{2^m}\), the opened value \(\sigma \) masks any information on \(\varvec{\mathbf {x}}\), \(\varvec{\mathbf {y}}\).

Fig. 3.
figure 3

Functionality – MPC

Fig. 4.
figure 4

Online phase

3.1 Comparison with MiniMAC and Committed MPC

We compare the communication complexity of our online phase with that of MiniMAC [15] and Committed MPC [16], two secret-sharing based MPC protocols which are well-suited for simultaneously evaluating k instances of the same boolean circuit. We will count broadcasting a message of M bits as communicating \(M(n-1)\) bits (M bits to each other party). This can be achieved using point-to-point channels as described in the full version of [14].

Communication Complexity of Our Protocol. Partially opening a \(\langle \cdot \rangle \)-authenticated secret involves \(2k(n-1)\) bits of communication, since we have one selected party receive the share of each other party and broadcast the reconstructed value. Likewise, partially opening a \([\cdot ]\)-authenticated value communicates \(2m(n-1)\) bits. In our online phase, every input gate requires \(k(n-1)\) bits of communication. Multiplication gates require the partial opening of two \(\langle \cdot \rangle \)-authenticated values and one \([\cdot ]\)-authenticated value, hence \((4k+2m)(n-1)\) bits of communication. An output gate requires a MAC check on (a linear combination of) previously partially opened values, then the partial opening of the output, and finally a MAC check on the output. A MAC check requires every party to communicate a MAC share in \(\mathbb {F}_{2^m}\), for a total of mn bits communicated. Hence output gates require \(2k(n-1)+2mn\) bits of communication.
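As a sanity check, these counts can be tallied with a short helper (the function names are ours; the formulas are exactly the ones derived above):

```python
def open_vec_bits(k, n):
    """Partial opening of a <.>-authenticated vector in F_2^k."""
    return 2 * k * (n - 1)

def open_field_bits(m, n):
    """Partial opening of a [.]-authenticated element of F_{2^m}."""
    return 2 * m * (n - 1)

def input_gate_bits(k, n):
    return k * (n - 1)

def mult_gate_bits(k, m, n):
    # two <.>-openings (epsilon, delta) plus one [.]-opening
    return 2 * open_vec_bits(k, n) + open_field_bits(m, n)

def output_gate_bits(k, m, n):
    # partial opening of the output plus two MAC checks of m*n bits each
    return open_vec_bits(k, n) + 2 * m * n
```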

MiniMAC. MiniMAC uses a linear error correcting code C with parameters \([\ell , k, d]\) (i.e., it allows for encoding of messages from \(\mathbb {F}_2^k\) into \(\mathbb {F}_2^\ell \) and has minimum distance d). Parties have additive shares of encodings \(C(\varvec{\mathbf {x}})\), where the shares are also codewords, and shares of the MAC \(\varvec{\mathbf {\alpha }}*C(\varvec{\mathbf {x}})\), which can be arbitrary vectors in \(\mathbb {F}_2^{\ell }\). In addition, at multiplication gates \(C^*\)-encodings of information are needed, where \(C^{*}\) is the code \(C^{*}=\mathrm {span}\{\varvec{\mathbf {x}}*\varvec{\mathbf {y}}\mid \varvec{\mathbf {x}},\varvec{\mathbf {y}}\in C\}\), i.e., the smallest linear code containing the coordinatewise product of every pair of codewords in C, with parameters \([\ell ,k^*,d^*]\). We always have \(d\ge d^*\), and the cheating success probability of the adversary in the protocol is \(2^{-d^*}\), so we need \(d^*\ge s\) for the statistical parameter s. The online phase of MiniMAC has a very similar communication pattern to ours: a multiplication requires opening two elements encoded with C (coming from the use of Beaver’s technique) and one encoded with \(C^{*}\). Since shares of C- (resp. \(C^*\)-) encodings are codewords in C (resp. \(C^*\)), and describing such codewords requires k bits (resp. \(k^*\) bits),Footnote 4 the total communication complexity is \((4k+2k^*)(n-1)\), so the difference with our protocol depends on the difference between the achievable parameters for their \(k^*\) and our m, compared below. Input gates require \(k(n-1)\) bits, as in our case, and for output gates, since MAC shares are arbitrary vectors in \(\mathbb {F}_2^{\ell }\), a total of \(2k(n-1)+2\ell n\) bits are sent. See the full version for more details.

Committed MPC. Committed MPC [16] is a secret-sharing based MPC protocol that relies on UC-secure additively homomorphic commitments for authentication, rather than on MACs. Efficient commitments of this type have been proposed in works such as [9, 10, 17], where the main ingredientFootnote 5 is again a linear error correcting code C with parameters \([\ell , k,d]\). In Committed MPC, for every \(\varvec{\mathbf {x}}\in \mathbb {F}_2^k\), each party \(P_i\) holds an additive share \(\varvec{\mathbf {x}}_i\in \mathbb {F}_2^k\) to which she commits towards every other party \(P_j\) (in the multi-receiver commitment from [9], this can be accomplished by only one commitment). During most of the online phase there are only partial openings of values; only at output gates are the commitments checked. Multiplication is done through Beaver’s technique. In this case only two values \(\varvec{\mathbf {\epsilon }}\), \(\varvec{\mathbf {\delta }}\) are partially opened. In exchange, parties need to communicate in order to compute commitments to \(\varvec{\mathbf {\delta }}*\varvec{\mathbf {a}}\) (resp. \(\varvec{\mathbf {\epsilon }}*\varvec{\mathbf {b}}\)) given \(\varvec{\mathbf {\delta }}\) and commitments to \(\varvec{\mathbf {a}}\) (resp. \(\varvec{\mathbf {\epsilon }}\) and commitments to \(\varvec{\mathbf {b}}\)), at least with current constructions of UC-secure homomorphic commitments. [16, full version, Fig. 16] provides a protocol where each of these products with known constant vectors requires communicating one full vector of length \(\ell \) and two vectors of \(k^*\) components (again \(\ell \) is the length of C and \(k^{*}\) is the dimension of \(C^{*}\)). In total the communication complexity of a multiplication is \((4k+2k^*+\ell )(n-1)\) bits. Output gates require opening all the commitments to the shares of the output. Opening the commitments in [9, 10, 17] requires sending two vectors of length \(\ell \) to every other party, for a total complexity of \(2\ell (n-1)n\) bits. Input gates have the same cost as in the other two protocols.

Concrete Parameters. Summing up, we compare the communication costs of multiplication and output gates in Table 1, since these are the gates where the communication differs.

Table 1. Total number of bits communicated in the different gates in the online phases, when computing k instances of a boolean circuit in parallel. Communication per party is obtained dividing by n.

The key quantities are the relation between m/k (in our case) and \(k^*/k\) and \(\ell /k\) in the other two protocols. While the possible parameters \(\ell , k, d\) of linear codes have been studied exhaustively in the theory of error-correcting codes, relations between those parameters and \(k^{*}\), \(d^{*}\) are much less studied, at least in the case of binary codes. As far as we know, the only concrete non-asymptotic results are given in [7, 12]. In particular, the parameters in Table 2 are achievable.

Table 2. Parameters for C and \(C^{*2}\) from [7].
Table 3. Parameters for RMFE from [8].

On the other hand, the parameters for our protocol depend on the parameters achievable by RMFEs. By Theorem 1, for all \(1\le r\le 33\) there exists an RMFE with \(k=3r\) and \(m=10r-5\). Some specific values are shown in Table 3.

This leads to the communication complexities per computed instance of the boolean circuit for security parameters \(s=64\) and \(s=128\) given in Table 4. For larger security parameter, the comparison becomes more favourable to our technique, since the “expansion factor” m/k degrades less than the one for known constructions of squares of error correcting codes.

Table 4. Total number of bits sent per instance at multiplication and output gates

If instead we want to use Theorem 2, so that we can define the MACs over a field of degree a power of two, then the last column would have complexities \(12\cdot (n-1)\) and \(8\cdot n+2 (n-1)\) in both the cases \(s=64\) and \(s=128\).
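The per-instance multiplication cost underlying Table 4 follows from the formula \((4k+2m)(n-1)\) of Table 1 combined with the RMFE family of Theorem 1. A small helper (illustrative only, using just the parameters stated above) recomputes it:

```python
def rmfe_params(r):
    """RMFE family from Theorem 1: k = 3r, m = 10r - 5, for 1 <= r <= 33."""
    assert 1 <= r <= 33
    return 3 * r, 10 * r - 5

def mult_bits_per_instance(r, n):
    """Bits sent at a multiplication gate, per computed circuit instance."""
    k, m = rmfe_params(r)
    return (4 * k + 2 * m) * (n - 1) / k
```

As r grows, the expansion factor m/k approaches 10/3, so the per-instance cost approaches \((4+20/3)(n-1)\approx 10.67(n-1)\) bits, illustrating why the comparison improves for larger security parameters.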

Comparison with an Online Communication-Efficient Version of MiniMAC. In [13], a version of MiniMAC is proposed which uses linear codes over the extension field \(\mathbb {F}_{256}\). The larger field enables the use of a Reed-Solomon code, for which \(k^*=2k-1\). However, because this only gives coordinatewise operations in \(\mathbb {F}_{256}^k\), the protocol needs to be modified in order to allow for bitwise operations instead. The modified version requires the opening of two \(C^*\)-encodings at every multiplication gate and a more complicated and much larger preprocessing: in addition to creating a certain type of multiplication triple, the preprocessing phase needs to provide authenticated sharings of 16 other vectors created from the bit decompositions of the coordinates of the two “factor” vectors in the triple. As far as we know, no preprocessing phase that creates these authenticated elements has been proposed.

The amortized communication complexity of that protocol is \(8(n-1)\) bits per multiplication gate, per instance of the circuit, which is slightly less than \(80\%\) of ours. On the other hand, we estimate that the complexity of the preprocessing would be at least 4 times that of our protocol, and possibly larger, based on the number of preprocessed elements and their correlation.

Computation and Storage. In terms of storage, each authenticated share of a k-bit vector is \(m+k\) bits, which is slightly over 4 bits per data bit. MiniMAC and Committed MPC require a larger storage of \(\ell +k\) bits because the MAC shares/commitments are in \(\mathbb {F}_2^{\ell }\). In [13] shares are also 4 bits per data bit because of the use of RS codes, but the amount of preprocessed data is much larger. In terms of computation, our protocol does slightly better for additions (again because of the shorter shares, and since addition in \(\mathbb {F}_{2^m}\) is as in \(\mathbb {F}_2^{m}\)), and the same holds for the additions required by multiplication gates. However, computing the terms \(\varvec{\mathbf {\epsilon }}* \langle \varvec{\mathbf {y}}\rangle \), \(\varvec{\mathbf {\delta }} * \langle \varvec{\mathbf {x}}\rangle \), \(\phi (\varvec{\mathbf {\epsilon }})\cdot \phi (\varvec{\mathbf {\delta }})\) requires in total 5 multiplications in \(\mathbb {F}_{2^m}\), which, being field multiplications, are more expensive than the coordinatewise ones required by MiniMAC, even if some of the latter are in a larger space \(\mathbb {F}_2^\ell \).

4 From Batch Computations to Single Circuit Computations

We now explain how to adapt our protocol, which was presented as a protocol for the simultaneous secure evaluation of k instances of the same boolean circuit, into a protocol that computes a single evaluation of a boolean circuit with little overhead, as long as the circuit is sufficiently “well-formed”. This is a fairly straightforward adaptation of the ideas presented in [15]. The technique can be used for any boolean circuit, but it works best when the circuit satisfies a number of features, which we can loosely sum up as follows:

  • The circuit is organized in layers, each layer consisting of the same type of gate (either additive or multiplicative). We number the layers in increasing order from the input layer (layer 0) to the output layer.

  • For most layers, the number of gates u is either a multiple of k or large enough so that the overhead caused by the need to add \(u'\) dummy gates to obtain a multiple of k and compute the gates in batches of k is negligible.

  • For most pairs of layers i and j, where \(i<j\), the number of output bits from layer i that are used as inputs in layer j is either 0 or sufficiently large so that we do not incur much overhead by adding dummy outputs or inputs (again to achieve blocks of size exactly k).

The idea from [15] is that given a layer of u gates, where we can assume \(u=t\cdot k\), we organize the inputs of the layer in t blocks of k gates, and we will compute each block by using the corresponding subroutine in our protocol.

For that we need to have authenticated shared blocks of inputs \(\langle \varvec{\mathbf {x}}\rangle \), \(\langle \varvec{\mathbf {y}}\rangle \) where the i-th coordinates \(x_i, y_i\) are the inputs of the i-th gate in the block. This assumes gates are of fan-in 2. For addition gates we can of course also support arbitrary fan-in, but then we want every gate in the same block to have the same fan-in, again to avoid overheads where we need to introduce dummy 0 inputs. In any case, at the end of the computation of this layer we obtain t authenticated sharings \(\langle \varvec{\mathbf {z}}\rangle \).

The question is now how to transition to another layer j. Let us assume that layer j takes inputs from l blocks \(\langle \varvec{\mathbf {x}}_1\rangle ,\ldots ,\langle \varvec{\mathbf {x}}_l\rangle \) of k bits, each coming from some previous layer. The issue is, of course, that we are not guaranteed that we can use these directly as input blocks for layer j. We will likely need to reorganize the bits in blocks, we may need to use some of the bits more than once, and we may not need some of the bits of some output blocks. At first sight this reorganization may look challenging, because the bits of each \(\varvec{\mathbf {x}}_i\) can be “quite intertwined” in the MAC \(\alpha \cdot \phi (\varvec{\mathbf {x}}_i)\).

However, in full generality, we can define \(l'\) functions \(F_1,\dots ,F_{l'}:\mathbb {F}_2^{kl}\rightarrow \mathbb {F}_2^k\) such that if we write \(\varvec{\mathbf {X}}=(\varvec{\mathbf {x}}_1,\varvec{\mathbf {x}}_2,\ldots ,\varvec{\mathbf {x}}_l)\) for the concatenation of the output blocks, then \(F_1(\varvec{\mathbf {X}}),\dots ,F_{l'}(\varvec{\mathbf {X}})\) are the input blocks we need. These maps are \(\mathbb {F}_2\)-linear; in fact, each of the coordinates of each \(F_i\) is either a projection to one coordinate of the input or the 0-map. We assume that all these reorganizing functions can be obtained from the description of the function to be computed, and therefore they are known and agreed upon by all parties.

Calling \(F=(F_1,F_2,\ldots ,F_{l'})\), suppose we can obtain by preprocessing

$$\begin{aligned} \big ((\langle \varvec{\mathbf {r}}_1 \rangle , \langle \varvec{\mathbf {r}}_2 \rangle , \ldots , \langle \varvec{\mathbf {r}}_l \rangle ),(\langle F_1(\varvec{\mathbf {R}})\rangle ,\langle F_2(\varvec{\mathbf {R}})\rangle ,\ldots ,\langle F_{l'}(\varvec{\mathbf {R}})\rangle )\big ), \end{aligned}$$

where \(\varvec{\mathbf {R}}=(\varvec{\mathbf {r}}_1,\varvec{\mathbf {r}}_2,\ldots ,\varvec{\mathbf {r}}_l)\) is again the concatenation in \(\mathbb {F}_2^{kl}\). To ease the notation we will write \((\langle \varvec{\mathbf {R}}\rangle , \langle F(\varvec{\mathbf {R}})\rangle )\) and call this a reorganizing pair.

Then, reorganizing is done in the following way. The parties compute \(\langle \varvec{\mathbf {x}}_j\rangle -\langle \varvec{\mathbf {r}}_j\rangle \) and open these values for \(j=1,2,\ldots ,l\). Afterwards, they compute

$$\begin{aligned} F_j(\varvec{\mathbf {x}}_1-\varvec{\mathbf {r}}_1,\ldots ,\varvec{\mathbf {x}}_l-\varvec{\mathbf {r}}_l)+\langle F_j(\varvec{\mathbf {r}}_1,\ldots ,\varvec{\mathbf {r}}_l)\rangle =\langle F_j(\varvec{\mathbf {x}}_1,\ldots ,\varvec{\mathbf {x}}_l)\rangle \end{aligned}$$

which holds by the linearity of \(F_j\).
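A minimal sketch of this reorganizing step with plain XOR-sharings (MACs omitted; all names are ours). Since each output coordinate of an \(F_j\) is either a projection or the 0-map, the maps are represented by a coordinate-selection table:

```python
import secrets

def xor_vec(u, v):
    return [a ^ b for a, b in zip(u, v)]

def share_vec(v, n):
    """XOR-share a bit vector among n parties."""
    shares = [[secrets.randbits(1) for _ in v] for _ in range(n - 1)]
    last = v
    for s in shares:
        last = xor_vec(last, s)
    return shares + [last]

def open_vec(shares):
    out = shares[0]
    for s in shares[1:]:
        out = xor_vec(out, s)
    return out

def linear_F(selection):
    """F_2-linear map: output coordinate i is X[selection[i]], or 0 if None."""
    return lambda X: [0 if s is None else X[s] for s in selection]

def reorganize(x_sh, r_sh, Fr_sh, F):
    """Given shares of blocks x_j and a reorganizing pair (<R>, <F(R)>):
    open the masked blocks x_j + r_j, apply F to the public value, and
    add the shares of F(R) to obtain shares of F(X)."""
    n = len(Fr_sh)
    masked = [open_vec([xor_vec(xs[p], rs[p]) for p in range(n)])
              for xs, rs in zip(x_sh, r_sh)]
    pub = F([b for blk in masked for b in blk])  # F(X - R), public
    out = [list(Fr_sh[p]) for p in range(n)]
    out[0] = xor_vec(out[0], pub)  # one designated party adds the public term
    return out
```

The correctness of the sketch is exactly the displayed identity: \(F(\varvec{\mathbf {X}}-\varvec{\mathbf {R}})+F(\varvec{\mathbf {R}})=F(\varvec{\mathbf {X}})\) by linearity.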

We can add this property to our setup above by including the supplements in Fig. 5 to \(\mathcal {F}_{\mathrm {Prep}}\), \(\mathcal {F}_{\mathrm {MPC}}\), and \(\varPi _{\mathrm {Online}}\). Apart from this we also need to point out that at the input layer, a party may need to add dummy inputs so that her input consists of a number of blocks of k bits.

Fig. 5.
figure 5

Reorganizing supplement

Of course, it looks as though we have moved the problem to the preprocessing phase, as we still need to construct the reorganizing random pairs \((\langle \varvec{\mathbf {R}}\rangle , \langle F(\varvec{\mathbf {R}})\rangle )\). But this will be easy because of the \(\mathbb {F}_2\)-linearity of the maps \(\phi \) and F.

The communication complexity of each reorganizing round is that of opening l vectors in \(\mathbb {F}_2^k\), i.e., \(2 l k(n-1)\) bits of communication. The efficiency of this technique therefore depends heavily on the topology of the circuit. For example, if all the output bits of a given layer are used in the next layer and only there, then this technique adds roughly 2 bits of communication per party per gate.

5 Preprocessing

In this section, we present how to obtain the preprocessed correlated information we need in our online protocols. The implementation of authentication and the construction of multiplication triples are adapted in a relatively straightforward way from MASCOT. This is because MASCOT is based on bit-OT extension, and working bit-by-bit is well suited for our situation since the maps \(\phi , \psi \) are \(\mathbb {F}_2\)-linear. For the preprocessing of multiplication triples we do need to introduce some auxiliary protocols relative to MASCOT: one is the preprocessing of reencoding pairs \((\langle \psi (r)\rangle ,[r])\) that we need for the online protocol anyway; another creates [r] for a random r in the kernel of \(\psi \), which we need in order to avoid some information leakage in the sacrifice step. Both types of preprocessing can easily be constructed based on the \(\mathbb {F}_2\)-linearity of \(\psi \). Finally, we use the sacrifice step from Committed MPC, rather than the one from MASCOT, because of some technical issues regarding the fact that the image of \(\phi \) is not the entire \(\mathbb {F}_{2^m}\), which creates problems when opening certain sharings.

Fig. 6.
figure 6

Overview of dependency of the protocols needed for the preprocessing.

Aside from the aforementioned multiplication triples \((\langle \varvec{\mathbf {a}}\rangle , \langle \varvec{\mathbf {b}}\rangle , \langle \varvec{\mathbf {c}}\rangle )\) where \(\varvec{\mathbf {c}}=\varvec{\mathbf {a}}*\varvec{\mathbf {b}}\), for the online phase we also need to generate input pairs \((\varvec{\mathbf {r}},\langle \varvec{\mathbf {r}}\rangle )\), reencoding pairs of the form \((\langle \psi (r)\rangle ,[r])\), and (in case we want to use the techniques in Sect. 4) layer reorganizing pairs \((\langle \varvec{\mathbf {R}}\rangle ,\langle F(\varvec{\mathbf {R}})\rangle )\).

For an overview of how the functionalities presented in this section depend on each other, consider Fig. 6. We use the following basic ideal functionalities: parties can generate uniform random elements in a finite set using the functionality \(\mathcal {F}_{\mathrm {Rand}}\) (for the sake of notational simplicity we omit referring to \(\mathcal {F}_{\mathrm {Rand}}\) in protocols). Moreover, parties have access to a commitment functionality \(\mathcal {F}_{\mathrm {Comm}}\), see Fig. 7. We will also make use of a functionality \(\mathcal {F}_{\mathrm {ROT}}^{n,k}\) that implements n 1-out-of-2 oblivious transfers of k-bit strings (Fig. 8).

Fig. 7.
figure 7

Functionalities – randomness generation and commitment

Fig. 8.
figure 8

Functionality – random OT

We adapt the correlated oblivious product evaluation functionality \(\mathcal {F}_{\mathrm {COPEe}}\) defined in MASCOT [19]. We recall how this functionality works: we again see the field \(\mathbb {F}_{2^m}\) as \(\mathbb {F}_{2}[X]/(f)\) for some irreducible polynomial \(f\in \mathbb {F}_2[X]\). Then \(\{1,X,X^2,\dots ,X^{m-1}\}\) is a basis for \(\mathbb {F}_{2^m}\) as an \(\mathbb {F}_2\)-vector space. The functionality as described in [19] takes an input \(\alpha \in \mathbb {F}_{2^m}\) from one of the parties \(P_B\) in the initialization phase; then there is an arbitrary number of extend phases where, on input \(x\in \mathbb {F}_{2^m}\) from \(P_A\), the functionality creates additive sharings of \(\alpha \cdot x\) for the two parties. However, if \(P_A\) is corrupted it may instead decide to input a vector of elements \((x_0,x_1,\dots , x_{m-1})\in (\mathbb {F}_{2^m})^m\), and in that case the functionality outputs a sharing of \(\sum _{i=0}^{m-1} x_i\cdot \alpha _i \cdot X^i\) (where \(\alpha _i\) are the coordinates of \(\alpha \) in the above basis). The honest case corresponds to all \(x_i\) being equal to x. This functionality from MASCOT corresponds to the steps Initialize and ExtendField in our version (Fig. 10). We augment it by adding the step ExtendVector, where party \(P_A\) can input a vector \(\varvec{\mathbf {x}} \in \mathbb {F}_{2}^k\) and the functionality outputs an additive sharing of \(\alpha \cdot \phi (\varvec{\mathbf {x}})\in \mathbb {F}_{2^m}\). If party \(P_A\) is corrupted it may instead input \((\varvec{\mathbf {x}}_0,\varvec{\mathbf {x}}_1,\ldots ,\varvec{\mathbf {x}}_{m-1})\in (\mathbb {F}_2^k)^m\). In that case the functionality outputs an additive sharing of \(\sum _{i=0}^{m-1} \phi (\varvec{\mathbf {x}}_i)\cdot \alpha _i \cdot X^i\); note that this is more restrictive for the corrupted adversary than ExtendField, since the values \(\phi (\varvec{\mathbf {x}}_i)\) are not free in \(\mathbb {F}_{2^m}\) but confined to the image of \(\phi \). We define the functionality \(\mathcal {F}_{\mathrm {COPEe}}\) in Fig. 9 and present a protocol implementing it in Fig. 10.
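The algebra behind the honest case can be checked directly on a toy field: decomposing \(\alpha \) in the basis \(\{1,X,\dots ,X^{m-1}\}\), an honest \(P_A\) inputting \(x_i=x\) in every slot yields \(\sum _{i} x\cdot \alpha _i\cdot X^i=\alpha \cdot x\). The sketch below uses \(m=3\) and \(f(X)=X^3+X+1\) purely for illustration (function names are ours):

```python
M = 3
F_POLY = 0b1011  # f(X) = X^3 + X + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiplication in F_2[X]/(f); elements are coefficient bitmasks."""
    res = 0
    for i in range(M):
        if (b >> i) & 1:
            res ^= a
        a <<= 1
        if a & (1 << M):
            a ^= F_POLY
    return res

def copee_honest(alpha, x):
    """Sum over the basis slots with honest inputs x_i = x:
    sum_i x * alpha_i * X^i, which should equal alpha * x."""
    total = 0
    for i in range(M):
        if (alpha >> i) & 1:            # alpha_i
            total ^= gf_mul(x, 1 << i)  # x * alpha_i * X^i
    return total
```

This identity is what makes the honest outputs of ExtendField and ExtendVector consistent additive sharings of \(\alpha \cdot x\) and \(\alpha \cdot \phi (\varvec{\mathbf {x}})\), respectively.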

Fig. 9.
figure 9

Functionality – correlated oblivious product evaluation with errors.

Fig. 10.
figure 10

Correlated oblivious product evaluation with errors.

Proposition 1

\(\varPi _{\mathrm {COPEe}}\) securely implements \(\mathcal {F}_{\mathrm {COPEe}}\) in the \(\mathcal {F}_{OT}^{m,\lambda }\)-hybrid model.

Proof

The commands Initialize and ExtendField are as in [19] (the latter being called Extend there). The proof for our ExtendVector command is analogous to the one for ExtendField, except that, as explained, the ideal functionality restricts the choice by a corrupt \(P_A\) of the element that is secret shared. We briefly show the simulation of ExtendVector together with Initialize.

If \(P_B\) is corrupted, the simulator receives \((\alpha _0,\dots ,\alpha _{m-1})\) from the adversary, and simulates the initialization phase by sampling the seeds at random, and sending the corresponding one to the adversary. It simulates the ExtendVector phase by choosing \(\varvec{\mathbf {u}}_i\) uniformly at random in the corresponding domain, computes q as an honest \(P_B\) would do and inputs this to the functionality. Indistinguishability holds by the pseudorandomness of F, as shown in [19].

If \(P_A\) is corrupted then the simulator receives the seeds from the adversary in the Initialize phase, and from there it computes all the \(\varvec{\mathbf {t}}_b^i\) in the ExtendVector phase. Then when the adversary sends \(\varvec{\mathbf {u}}_i\), the simulator extracts \(\varvec{\mathbf {x}}_i=\varvec{\mathbf {u}}_i-\varvec{\mathbf {t}}_0^i+\varvec{\mathbf {t}}_1^i\) and inputs \({t}=-\sum _{i=0}^{m-1}\phi (\varvec{\mathbf {t}}_0^i)\cdot X^i\) and \((\varvec{\mathbf {x}}_0,\varvec{\mathbf {x}}_1,\ldots ,\varvec{\mathbf {x}}_{m-1})\) to \(\mathcal {F}_{\mathrm {COPEe}}\). In this case all outputs are computed as in the real world and indistinguishability follows.

5.1 Authentication

In protocol \(\varPi _{\mathrm {Auth}}\) (Figs. 11, 12, and 13), we use \(\mathcal {F}_{\mathrm {COPEe}}\) to implement \(\mathcal {F}_{\mathrm {Auth}}\).

Fig. 11.
figure 11

Authenticated shares – part 1.

Fig. 12.
figure 12

Authenticated shares – part 2.

Fig. 13.
figure 13

Authenticated shares – part 3.

Fig. 14.
figure 14

Creating input pairs.

In the Initialize phase each pair of parties \((P_i, P_j)\) calls the initialize phase of \(\mathcal {F}_{\mathrm {COPEe}}\), where \(P_i\) inputs a MAC key. Afterwards \(P_j\) can create authenticated sharings of the desired values, both of boolean vectors and of elements in the larger field: namely, \(P_j\) constructs additive random sharings of the individual values and uses the appropriate extend phase of \(\mathcal {F}_{\mathrm {COPEe}}\) to obtain additive sharings of the MACs. Finally, a random linear combination of the values chosen by \(P_j\) is checked. Here privacy is achieved by letting \(P_j\) include a dummy input \(x_{t+1}\) to mask the other inputs.

Proposition 2

\(\varPi _{\mathrm {Auth}}\) securely implements \(\mathcal {F}_{\mathrm {Auth}}\) in the \((\mathcal {F}_{\mathrm {COPEe}},\mathcal {F}_{\mathrm {Rand}},\mathcal {F}_{\mathrm {Comm}})\)-hybrid model.

Proof

Since the proof is similar to the proof of security for \(\varPi _{[[\cdot ]]}\) in [19], we point out the differences and argue why they do not impact security.

First of all, note that our functionality, in contrast to \(\varPi _{[[\cdot ]]}\), has both an Add command and a LinComb command. This is because we reserve the LinComb command for linear combinations which output \([\cdot ]\)-sharings, while Add outputs a \(\langle \cdot \rangle \)-sharing. In any case, the Add and LinComb commands consist of local computations, so it is trivial to argue their security. The Initialize command only invokes the Initialize command of the ideal functionality \(\mathcal {F}_{\mathrm {COPEe}}\), which is exactly the same as in [19]. Since the Open command lets the adversary choose what to open to, there is not much to discuss here either.

Therefore, what we need to discuss is the Input and Check commands. The idea is that if the check in the input phase passes and the adversary opens to incorrect values later on, then the probability of passing a later check is negligible. In comparison to [19], we have both values in \(\mathbb {F}_{2^m}\) and vectors in \(\mathbb {F}_2^k\), but we can still use the same arguments, because the check in the Input phase and all further checks are in \(\mathbb {F}_{2^m}\), and therefore the simulation and indistinguishability follow by the exact same arguments as in [19].

5.2 Input, Reencoding, and Reorganizing Pairs

The two functionalities \(\mathcal {F}_{\mathrm {COPEe}}\) and \(\mathcal {F}_{\mathrm {Auth}}\) are the building blocks for the preprocessing. They are very similar in shape to the MASCOT functionalities, but with a few modifications so that sharings can be of vectors instead of field elements in \(\mathbb {F}_{2^m}\). With these building blocks we can produce the randomness needed for the online phase. First of all, we produce input pairs with protocol \(\varPi _{\mathrm {InputPair}}\) in Fig. 14. Proposition 3 is straightforward.

Proposition 3

\(\varPi _{\mathrm {InputPair}}\) securely implements \(\mathcal {F}_{\mathrm {Prep.InputPair}}\) in the \(\mathcal {F}_{\mathrm {Auth}}\)-hybrid model.

We also need to construct pairs to re-encode \([\cdot ]\)-sharings to \(\langle \cdot \rangle \)-sharings after a multiplication. A protocol \(\varPi _{\mathrm {ReEncodePair}}\) for producing the pairs \((\langle \psi (r)\rangle ,[r])\) for random \(r\in \mathbb {F}_{2^m}\) is shown in Fig. 15.

Proposition 4

\(\varPi _{\mathrm {ReEncodePair}}\) securely implements \(\mathcal {F}_{\mathrm {Prep.ReEncodePair}}\) in the \((\mathcal {F}_{\mathrm {Auth}},\mathcal {F}_{\mathrm {Rand}})\)-hybrid model with statistical security parameter s.

Proof

First notice that at least one of the parties is honest and hence \({r}_j=\sum _{i=1}^n {r}_j^{(i)}\) is random because one of the terms is. Suppose that at the end of the Combine phase parties have created \((\langle \varvec{\mathbf {s}}_j\rangle ,[r_j])\), where possibly \(\varvec{\mathbf {s}}_j\ne \psi (r_j)\).

Let \(\varvec{\mathbf {\epsilon }}_j=\varvec{\mathbf {s}}_j- \psi (r_j)\) for all j. By the \(\mathbb {F}_2\)-linearity of \(\psi \), \(\varvec{\mathbf {b}}_i-\psi (b_i)=\sum _{j=1}^{t+s} a_{ij}\varvec{\mathbf {\epsilon }}_j\). Hence if all \(\varvec{\mathbf {\epsilon }}_j=\varvec{\mathbf {0}}\), the check passes for all i. If, on the other hand, there is some \(\varvec{\mathbf {\epsilon }}_j\ne \varvec{\mathbf {0}}\), \(j=1,\dots ,t\), then for every i the probability that \(\sum _{j=1}^{t+s} a_{ij}\varvec{\mathbf {\epsilon }}_j=\varvec{\mathbf {0}}\) is at most 1/2.

Since the checks are independent we obtain that if some \(\varvec{\mathbf {\epsilon }}_j\ne \varvec{\mathbf {0}}\), \(j=1,\dots ,t\) then the protocol will abort except with probability at most \(2^{-s}\). Note also that \(b_i=r_{t+i}+\sum _{j=1}^t a_{ij} r_j\), so opening the \(b_i\) reveals no information about the output values \(r_1,\ldots , r_t\).
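The core probability fact, namely that a nonzero error survives one random \(\mathbb {F}_2\)-linear check with probability exactly 1/2 when a single \(\varvec{\mathbf {\epsilon }}_j\) is nonzero (and at most 1/2 in general), can be verified exhaustively on a small instance (illustrative only; error vectors are encoded as integer bitmasks):

```python
from itertools import product

def check_passes(coeffs, eps):
    """Does the combination sum_j a_j * eps_j vanish over F_2?"""
    acc = 0
    for a, e in zip(coeffs, eps):
        if a:
            acc ^= e
    return acc == 0

def pass_fraction(eps):
    """Fraction of coefficient vectors in F_2^t for which the check passes."""
    t = len(eps)
    trials = list(product((0, 1), repeat=t))
    return sum(check_passes(c, eps) for c in trials) / len(trials)
```

For \(\varvec{\mathbf {\epsilon }}=\varvec{\mathbf {0}}\) every check passes; with a single nonzero entry exactly half do, so s independent checks catch the error except with probability \(2^{-s}\), matching the bound above.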

Fig. 15.
figure 15

Re-encode pairs.

Finally, a protocol for producing reorganizing pairs is given in Fig. 16.

Proposition 5

\(\varPi _{\mathrm {ReOrgPair}}\) securely implements \(\mathcal {F}_{\mathrm {Prep.ReOrgPair}}\) in the \((\mathcal {F}_{\mathrm {Auth}},\mathcal {F}_{\mathrm {Rand}})\)-hybrid model with statistical security parameter s.

The proof of this proposition is similar to that of Proposition 4.

Fig. 16.
figure 16

Re-organize pairs.

Fig. 17.
figure 17

Construction of multiplication triples.

Fig. 18.
figure 18

Multiplication triples.

5.3 Multiplication Triples

Our protocol \(\varPi _{\mathrm {Triple}}\) for constructing triples is given in Figs. 17 and 18. We note that \(\varvec{\mathbf {c}}=\varvec{\mathbf {a}}*\varvec{\mathbf {b}}=\sum _{i,j}\varvec{\mathbf {a}}^{(i)}*\varvec{\mathbf {b}}^{(j)}\), and hence sharings of \(\varvec{\mathbf {c}}\) can be obtained by adding sharings of the summands, where each summand only requires two parties \(P_i\) and \(P_j\) to interact. Again, the construction step is much like the construction step of the protocol \(\varPi _{\mathrm {Triple}}\) in [19], where we have modified the protocol such that it produces triples \((\langle \varvec{\mathbf {a}}\rangle ,\langle \varvec{\mathbf {b}}\rangle ,\langle \varvec{\mathbf {c}}\rangle )\) instead of ([a], [b], [c]).

However, after authentication, we use techniques from Committed MPC [16] to check correctness and avoid leakage on the produced triples. Indeed, using the combine and sacrifice steps from MASCOT presents some problems in our case: in the sacrificing step in MASCOT, parties take two triples ([a], [b], [c]) and \(([\hat{a}],[b],[\hat{c}])\) and start by opening a random combination \(s\cdot [a]-[\hat{a}]\) to some value \(\rho \), so that they can later verify that \(s\cdot [c]-[\hat{c}]-\rho \cdot [b]\) opens to 0. Since the second triple will be disregarded, and \(s\cdot a-\hat{a}\) completely masks a because \(\hat{a}\) is uniformly random, no information is revealed about a. In our case, where we would have triples \((\langle \varvec{\mathbf {a}}\rangle ,\langle \varvec{\mathbf {b}}\rangle ,\langle \varvec{\mathbf {c}}\rangle )\) and \((\langle \hat{\varvec{\mathbf {a}}}\rangle ,\langle \varvec{\mathbf {b}}\rangle ,\langle \varvec{\mathbf {\hat{c}}}\rangle )\) and would sample a random \(s\in \mathbb {F}_{2^m}\), it would not be the case that \(\phi (\hat{\varvec{\mathbf {a}}})\) acts as a proper one-time pad for \(s\cdot \phi (\varvec{\mathbf {a}})\)Footnote 6. A similar problem would arise when adapting the combine step in [19].

Therefore, we proceed as in [16]: in the protocol \(\varPi _{\mathrm {Triple}}\) we start by constructing additive sharings of \(N=\tau _1+\tau _1\cdot \tau _2^2\cdot T\) triples. Then some of these triples are opened and checked for correctness, which guarantees that most of the remaining triples are correct. The remaining triples are then organized in buckets, and in each bucket all but one of the triples are sacrificed in order to guarantee that the remaining triple is correct with very high probability. In order to be able to open proper sharings in the sacrifice step we need to add authenticated sharings of an element in the kernel of \(\psi \). We present a functionality serving that purpose in Fig. 19 and a protocol implementing it in Fig. 20.
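The parameter bookkeeping of this cut-and-choose pipeline can be sketched as follows. This is only counting, not the interactive checks, and the bucket sizes (\(\tau _2\) for sacrifice and for each of two combine stages) are assumptions for illustration, instantiated with \(\tau _1=\tau _2=3\) as suggested in [16], where the counts are consistent.

```python
# Parameter bookkeeping for the cut-and-choose pipeline (sketch only;
# bucket sizes are illustrative assumptions, with tau1 = tau2 = 3).
tau1, tau2, T = 3, 3, 5
N = tau1 + tau1 * tau2**2 * T         # triples constructed
remaining = N - tau1                  # after opening tau1 for checking
per_output = remaining // T           # tau1 * tau2^2 = 27 triples per output

# Sacrifice in buckets of tau2 (keep 1 of each), then combine twice
# (once for a-leakage, once for b-leakage), again tau2-to-1:
after_sacrifice = per_output // tau2              # 9
after_combine = after_sacrifice // tau2 // tau2   # 1 final triple
assert after_combine == 1

# One kernel element is consumed per sacrificed triple:
sacrificed_per_output = per_output - after_sacrifice  # 18
assert sacrificed_per_output == (tau2 - 1) * tau2**2
```

With these parameters each output triple consumes 27 constructed triples and 18 sacrifices, matching the counts used in the complexity estimate below.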

Fig. 19. Functionality – authenticated random element in \(\ker (\psi )\).

Fig. 20. Authenticated random element in \(\ker (\psi )\).

Proposition 6

\(\varPi _{\mathrm {RanKer}}\) securely implements \(\mathcal {F}_{\mathrm {RanKer}}\) in the \((\mathcal {F}_{\mathrm {Auth}},\mathcal {F}_{\mathrm {Rand}})\)-hybrid model with statistical security parameter s.

The proof of this proposition is similar to that of Proposition 4. The correctness follows from the additivity of \(\psi \).

The sacrifice step opens the door to a selective failure attack, in which the adversary can learn some information about the remaining triples from the fact that the protocol has not aborted; a final combining step is therefore used to remove this leakage.

Proposition 7

\(\varPi _{\mathrm {Triple}}\) securely implements \(\mathcal {F}_{\mathrm {Prep.Triple}}\) in the \((\mathcal {F}_{\mathrm {Auth}},\mathcal {F}_{\mathrm {ROT}}^{m,k},\mathcal {F}_{\mathrm {Rand}}, \mathcal {F}_{\mathrm {RanKer}})\)-hybrid model.

The proof uses similar arguments as in [16] and can be found in the full version.

Proposition 8

\(\varPi _{\mathrm {InputPair}}\), \(\varPi _{\mathrm {ReEncodePair}}\), and \(\varPi _{\mathrm {Triple}}\) securely implement \(\mathcal {F}_{\mathrm {Prep}}\) in the \((\mathcal {F}_{\mathrm {Auth}},\mathcal {F}_{\mathrm {ROT}}^{m,k},\mathcal {F}_{\mathrm {Rand}})\)-hybrid model.

Proof

This follows directly from Propositions 3, 4, and 7.

Complexity of Preprocessing

We briefly describe the communication complexity of producing the randomness needed for the online phase. Starting with the construction of an input pair, the only communication to consider is a single call to \(\mathcal {F}_{\mathrm {Auth.Input}}\). The main cost of authentication is the call to \(\varPi _{\mathrm {COPEe}}\), where the parties need to send \(mk(n-1)\) bits for each vector authenticated. When a field element is authenticated instead, they need to send \(m^2(n-1)\) bits. Furthermore, the authenticating party needs to send the shares of the vector being authenticated, but this costs only \(k(n-1)\) bits. Finally, the check is carried out, but we assume that the parties authenticate many vectors/values in a batch, so this cost is amortized away.

For the re-encoding pairs we assume that t is much larger than s. Then, in order to obtain a single pair, the parties need to authenticate n field elements and n vectors. Once again assuming that the check is amortized away, this gives a total cost of sending \((m^2+mk)n(n-1)\) bits.

The same assumption, that t is much larger than s, is made for the reorganizing pairs and for the random elements in the kernel of \(\psi \). The amortized cost of producing a reorganizing pair is then \((l+l')n\) vector authentications, and obtaining [r] for \(r\in \ker (\psi )\) costs n authentications amortized.

Regarding the communication needed to obtain a single multiplication triple, we ignore the vectors sent in the construction, since the authentication is much more expensive. Besides authentication, we make \(\tau _1\tau _2^2n(n-1)\) calls to \(\mathcal {F}_{\mathrm {ROT}}^{k,1}\), and we authenticate \(3\tau _1\tau _2^2n\) vectors in the construction. Furthermore, we need \((\tau _2-1)\tau _2^2\) elements from \(\mathcal {F}_{\mathrm {RanKer}}\) and two re-encoding pairs for the construction of the triple. The remaining steps are nowhere near as costly, so we ignore them.

In [16] it is suggested to use \(\tau _1=\tau _2=3\). The cost of preparing a multiplication gate with these parameters is that of producing 3 re-encoding pairs (2 for the preprocessing and 1 for the online phase), 18 authenticated elements in the kernel of \(\psi \), and the multiplication triple itself, which requires 27 calls to \(\mathcal {F}_{\mathrm {ROT}}^{k,1}\) and \(3\cdot 27\) vector authentications. Thus, using \(m=3.1k\) from Table 3 in order to obtain security \(s\ge 64\), and ignoring the calls to \(\mathcal {F}_{\mathrm {ROT}}^{k,1}\), the communication becomes

$$\begin{aligned}&3\cdot (3.1^2+3.1)k^2n(n-1)+18\cdot 3.1^2k^2n(n-1)+3\cdot 27 \cdot 3.1\cdot k^2n(n-1) \text { bits}\\ {}&=462.21\cdot k^2n(n-1)\text { bits}. \end{aligned}$$

Similarly, in order to obtain \(s\ge 128\) we use \(m=3.21k\) from Table 3 and the communication becomes \(486.03\cdot k^2n(n-1)\) bits.
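The two estimates can be reproduced by collecting the coefficients of \(k^2n(n-1)\) bits, writing \(m=rk\): 3 re-encoding pairs at \((m^2+mk)\), 18 kernel elements authenticated as field elements at \(m^2\), and \(3\cdot 27\) vector authentications at mk. The helper below is only an arithmetic check of these counts.

```python
# Arithmetic check of the preprocessing cost estimate: coefficient of
# k^2 * n * (n-1) bits per multiplication gate, for tau1 = tau2 = 3,
# as a function of r = m / k.
def cost_coefficient(r):
    reencode = 3 * (r**2 + r)  # 3 re-encoding pairs, (m^2 + m*k) bits each
    kernel   = 18 * r**2       # 18 kernel elements, m^2 bits each
    vectors  = 3 * 27 * r      # 81 vector authentications, m*k bits each
    return reencode + kernel + vectors

assert abs(cost_coefficient(3.1) - 462.21) < 0.01    # s >= 64
assert abs(cost_coefficient(3.21) - 486.03) < 0.01   # s >= 128
```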