1 Introduction

Verifiable computation enables resource-limited personal computers to perform intensive computations by outsourcing them to a powerful cloud. In the age of the cloud, resources are becoming more centralized. Individuals lacking computational capacity need only buy the corresponding service in the cloud instead of purchasing their own expensive equipment. In this way, individuals perform their computations cheaply, and the shared resources in the cloud are fully utilized. This service model is therefore attracting increasing attention.

In this paper, we call those who want to outsource intensive tasks clients, and those who have powerful resources servers. As the server may be dishonest, the client should verify the returned result to guard against malicious behavior. The cost of verification must be lower than the cost of performing the computation itself, otherwise outsourcing makes no sense. In many instances, parties other than the client also want to use the computation result, so it is preferable if the returned result can be publicly verified by all parties. For example, suppose a doctor asks a server to perform a computation on the data of his patient, and nurses need the result for better nursing. If the nurses can verify the result themselves, they can obtain the correct result even when the doctor is offline. Many scenarios are like this.

1.1 Related Work

Verifiable computation has a large body of prior work, in two branches: one on general functions, the other on specific functions. Work on general functions often used proofs of knowledge to verify the correctness of the returned result [10, 11, 17, 18, 25]. After Gentry et al. constructed a fully homomorphic encryption scheme over ideal lattices [13], several verifiable computation protocols for general functions based on fully homomorphic encryption appeared [1, 8, 16, 21]. A representative is the work of Gennaro et al. [16], who combined fully homomorphic encryption with Yao's garbled circuit and used the range of the circuit to verify the result. Work on specific functions exploits the special structure of the outsourced function and therefore often focuses on polynomial and matrix computations [2, 4, 12, 26]; there is also work on linear algebra [23] and exponential operations [19]. Our verifiable computation protocol targets large polynomials, which have a large number of variables and high degree. Such polynomials are widely used in statistics. In the following we introduce the two notions most relevant to our work: amortized verifiable computation and public verification.

Amortized Verifiable Computation. This notion was proposed by Gennaro et al. [16] and has been widely used in later work on verifiable computation [2, 4, 7, 12, 26]. The client performs a pre-computation for a specific function; the result returned by the server can then be verified at low cost. Although the pre-computation may be as expensive as evaluating the outsourced function once, the function can be evaluated by the server many times on different inputs, so after several computations the expensive pre-computation cost is amortized.

Benabbas et al. [4] followed this amortized notion and proposed a novel method for verifiable polynomial computation. They used pseudorandom functions with closed-form efficiency to generate a series of values as new polynomial coefficients according to the polynomial's structure. The client uses the reconstructed polynomial to verify the correctness of the returned result; since the reconstructed polynomial can be evaluated efficiently, the protocol is efficient in the amortized sense. The randomness of their pseudorandom functions is based on the decisional Diffie-Hellman assumption.

Backes et al. [2] proposed another new method for verifiable computation of quadratic polynomials. They combined homomorphic MACs with verifiable computation; this is also a representative of amortized verifiable computation. The client performs a pre-computation on the multi-variable quadratic polynomial first; the result returned by the server can then be verified by evaluating a quadratic polynomial in two variables, so verification is very efficient. Though their protocol is limited to quadratic polynomials, their work is illuminating. The security of their homomorphic MAC is based on the decision linear assumption.

Public Verification. Recently, two publicly verifiable computation protocols have appeared. Parno et al. [24] used attribute-based encryption as a primitive. Recall that the ciphertext of an attribute-based encryption scheme can be decrypted only if the attribute satisfies a function. They used the image of an attribute under a one-way function as the public verification key; any verifier applies this one-way function to the returned result and checks equality with the public verification key. Their protocol is suitable for functions expressible as poly-size Boolean formulas. The other work is by Fiore et al. [12], who followed the work of Benabbas et al. [4] and made the result publicly verifiable by combining it with a bilinear map. They used a pseudorandom function proposed by Lewko and Waters [22] and reduced the security of their verifiable computation protocol to the co-computational Diffie-Hellman assumption.

In publicly verifiable computation protocols, the client performs an off-line pre-computation that depends only on the function, and then performs an on-line pre-computation for each input. The result of the on-line pre-computation is made public to allow public verification. The expensive off-line pre-computation cost is amortized over the on-line pre-computations if the function is outsourced many times on different inputs.

1.2 Our Contribution

In this paper, we present a verifiable computation protocol for large polynomials; we call polynomials with a large number of variables and high degree large polynomials. Such polynomials have significant uses in statistics. We follow the idea of Backes et al. [2] and extend their protocol to a more generally applicable setting: their protocol verifies computations of quadratic polynomials, while ours handles high-degree polynomials and allows the result to be publicly verified. One challenge is that their protocol is restricted to quadratic polynomials because their basic tool, a homomorphic MAC, is constructed over a bilinear map, where multiplication of exponents can be performed at most once. To build a verifiable computation protocol for high-degree polynomials, a multilinear map is the intuitive tool. Fortunately, Garg et al. [14] gave a plausible lattice-based construction of a multilinear map. Though this multilinear map is not yet efficient, that has not stopped people from using it in new constructions [6, 15, 20, 26]. The multilinear map is practical in our protocol because the pre-computation is performed over the integers first and only its result is encoded as a group element. Another challenge is that the randomness of the pseudorandom function used in their homomorphic MAC relies on the decision linear assumption, which no longer holds in a multilinear map setting. We therefore construct a new pseudorandom function based on the subgroup decisional assumption; moreover, this pseudorandom function reduces the pre-computation cost more than the one used by Backes et al. The last challenge is to realize public verification, where we follow the idea of Fiore et al. [12]. The security is based on the co-computational Diffie-Hellman assumption.

Assume the outsourced polynomial has m variables and each monomial has degree at most d. The main features of our protocol are as follows:

  • Our protocol is a publicly verifiable computation protocol on large polynomials.

  • We follow the idea of amortized verifiable computation. The off-line pre-computation cost is \(O((m+1)^{d})\), the same as the cost of evaluating the outsourced polynomial once. The on-line pre-computation cost is O(d) plus one multilinear map operation. After several computations on different inputs, the off-line pre-computation cost is amortized.

2 Preliminaries

Notation. If S is a set, \(x\mathop {\longleftarrow }\limits ^{U}S\) denotes uniformly choosing an element x from S. If \(\mathcal {A}\) is an algorithm, \(x\leftarrow \mathcal {A}(\cdot )\) denotes the process of running \(\mathcal {A}\) on some appropriate input and assigning its output to x. Let \(n\in \mathbb {N}\) be the security parameter. Lastly, we abbreviate param for public parameter, PPT for probabilistic polynomial time, and PRF for pseudorandom function.

2.1 Multilinear Maps

One of our basic tools is the multilinear map. Garg et al. [14] gave a plausible lattice-based construction, and Coron et al. [9] gave another construction over the integers. Their multilinear maps are in fact graded encoding systems; here we review an intuitive definition. All groups in this paper are cyclic groups of order \(N=pq\), where p and q are n-bit primes.

Definition 1

(Multilinear Map). Let \(\overrightarrow{G}=(\mathbb {G}_{1},\ldots ,\mathbb {G}_{k})\) be a sequence of cyclic groups each of order N, and \(g_{i}\) be a canonical generator of \(\mathbb {G}_{i}\). There exists a set of bilinear maps \(\{e_{i, j}: \mathbb {G}_i\times \mathbb {G}_j\rightarrow \mathbb {G}_{i+j}|i, j\ge 1\wedge i+j\le k\}\), which satisfy the following operations:

$$\begin{aligned} e_{i,j}(g_i^{a}, g_{j}^{b})=g_{i+j}^{ab}: \forall a, b\in \mathbb {Z}_N. \end{aligned}$$

When the context is obvious, we drop the subscripts i and j and write, e.g., \(e(g_i^{a}, g_j^{b})=g_{i+j}^{ab}\).

Let \(\mathcal {G}(1^{n},k)\) denote a multilinear map generator that takes a security parameter n and a positive integer k, the required encoding level, as inputs. The output of \(\mathcal {G}(1^{n},k)\) is a multilinear map \(\varGamma _{k}=(N,\mathbb {G}_{1},\ldots ,\mathbb {G}_{k},g_{1},\ldots ,g_{k},e)\) as described above. In a multilinear map setting, high-degree multiplication of exponents is possible, without the degree-2 restriction of a bilinear map setting.
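To make the interface concrete, here is a toy stand-in for \(\varGamma _{k}\) in which a level-i encoding \(g_{i}^{x}\) is simply the pair (i, x mod N). Exposing the exponent like this is completely insecure; the sketch only mirrors the operations of Definition 1, and the tiny modulus is an illustrative choice of ours.

```python
from dataclasses import dataclass

N = 3 * 5    # toy composite order N = pq; a real scheme uses n-bit primes
K_LEVEL = 4  # multilinearity level k

@dataclass(frozen=True)
class Enc:
    level: int  # the group G_i this encoding lives in
    exp: int    # the exponent x of g_i^x, reduced mod N

def encode(level: int, x: int) -> Enc:
    """Toy encoding of g_level^x."""
    return Enc(level, x % N)

def mul(u: Enc, v: Enc) -> Enc:
    """Group operation inside G_i: g_i^a * g_i^b = g_i^{a+b}."""
    assert u.level == v.level, "operands must live in the same group"
    return Enc(u.level, (u.exp + v.exp) % N)

def e(u: Enc, v: Enc) -> Enc:
    """Pairing e_{i,j}(g_i^a, g_j^b) = g_{i+j}^{ab}, defined for i + j <= k."""
    assert u.level + v.level <= K_LEVEL, "pairing exceeds level k"
    return Enc(u.level + v.level, (u.exp * v.exp) % N)
```

For instance, `e(encode(1, a), encode(1, b))` lands in \(\mathbb {G}_{2}\) with exponent ab, matching the displayed equation, and pairings can be chained up to level k.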

2.2 Pseudorandom Function

Here, we review a definition of PRFs. A PRF consists of two algorithms, KeyGen and \(F_{K}(\cdot )\). Assume the domain of the PRF is \(\mathcal {X}\) and the range is \(\mathcal {Y}\); KeyGen produces a secret key K, and \(F_{K}(\cdot )\) maps an input \(x\in \mathcal {X}\) to \(y\in \mathcal {Y}\) under K. The definition is as follows:

Definition 2

(PRF). F is a pseudorandom function if for every \(\mathrm {PPT}\) adversary \(\mathcal {A}\), there exists a negligible function \(neg(\cdot )\) such that for all n:

$$\begin{aligned} {\begin{matrix} |Pr[\mathcal {A}^{F_{K}(\cdot )}(1^{n},param)=1]-Pr[\mathcal {A}^{R(\cdot )}(1^{n},param)=1]|\le neg(n) \end{matrix}} \end{aligned}$$

where \(R : \mathcal {X} \rightarrow \mathcal {Y}\) is a random function.

2.3 Computational Assumptions

Let \(\varGamma _{k}=(N,\mathbb {G}_{1},\ldots ,\mathbb {G}_{k},g_{1},\ldots ,g_{k},e)\leftarrow \mathcal {G}(1^{n},k)\) be a k-linear map. We review the (k, l)-Multilinear Diffie-Hellman Inversion assumption suggested by Sahai et al. [20]:

Definition 3

((k, l)-MDHI). Given \(\varGamma _{k}\) and \(g_{1},g_{1}^{a},\ldots ,g_{1}^{a^{l}}\in \mathbb {G}_{1}\), where \(a\mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\), the advantage of an adversary \(\mathcal {A}\) in computing \(g_{k}^{a^{kl+1}}\) is

$$\begin{aligned} \mathbf{Adv}_{\mathcal {A}}^{mdhi}=|Pr[\mathcal {A}(\varGamma _{k},g_{1}^{a},\ldots ,g_{1}^{{a^{l}}})=g_{k}^{{a^{kl+1}}}]|. \end{aligned}$$

For any \(\mathrm {PPT}\) adversary \(\mathcal {A}\), there exists a negligible function \(neg (\cdot )\) such that for all n, \(\mathbf {Adv}_{\mathcal {A}}^{mdhi}(n)\le neg(n)\).

The subgroup decisional assumption was first suggested by Boneh et al. [3]. Given \(\mathbb {G}_{i}\) with order \(N=pq\) and \(u\mathop {\longleftarrow }\limits ^{U}\mathbb {G}_{i}\), it is hard to determine whether u belongs to subgroup \(\mathbb {G}^{q}_{i}\) or not.

Definition 4

( \(\mathrm {SDA}_{i}\) ). Given \(\mathbb {G}_{i}\) and \(u\mathop {\longleftarrow }\limits ^{U}\mathbb {G}_{i}\), the advantage of an adversary \(\mathcal {A}\) in determining whether u belongs to subgroup \(\mathbb {G}_{i}^{q}\) or not is

$$\begin{aligned} \mathbf {Adv}_{\mathcal {A}}^{sda_{i}}=|Pr[\mathcal {A}(\mathbb {G}_{i},u)=1]-Pr[\mathcal {A}(\mathbb {G}_{i},u^{p})=1]|. \end{aligned}$$

For any \(\mathrm {PPT}\) adversary \(\mathcal {A}\), there exists a negligible function \(neg(\cdot )\) such that for all n, \(\mathbf {Adv}_{\mathcal {A}}^{sda_{i}}(n)\le neg(n)\).

Zhang et al. [26] proved that the subgroup decisional assumption holds for \(\varGamma _{k}\) if \(\mathrm {SDA}_{i}\) holds for every \(\mathbb {G}_{i}\), \(i=1,\ldots ,k\).

The last one is the co-computational Diffie-Hellman assumption suggested by Boneh et al. [5].

Definition 5

( \(\mathrm {co}\) -CDH Assumption). Given \(\varGamma _{k}\) and \(g_{1}^{a},g_{2}^{b}\), where \(a, b\mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\), the advantage of an adversary \(\mathcal {A}\) in finding out \(g_{1}^{ab}\) is

$$\begin{aligned} \mathbf {Adv}_{\mathcal {A}}^{cdh}=Pr[\mathcal {A}(\varGamma _{k},g_{1}^{a},g_{2}^{b})=g_{1}^{ab}]. \end{aligned}$$

For any \(\mathrm {PPT}\) adversary \(\mathcal {A}\), there exists a negligible function \(neg (\cdot )\) such that for all n, \(\mathbf {Adv}_{\mathcal {A}}^{cdh}(n)\le neg(n)\).

2.4 Basic Model

Now we review the basic publicly verifiable computation model. The client performs an off-line pre-computation that depends only on the outsourced function via the \(\mathbf {KeyGen}\) algorithm, and then performs an on-line pre-computation on a specific input via the \(\mathbf {ProbGen}\) algorithm. The result of the on-line pre-computation is made public to allow public verification. The server runs the \(\mathbf {Compute}\) algorithm and returns an encoded output \(\sigma _{y}\). Any third party can verify the returned result and output a value y or the error \(\bot \).

Let \(\mathcal {F}\) be a family of functions. A publicly verifiable computation protocol \(\mathcal {VC}\) for \(\mathcal {F}\) is as follows:

  • \(\mathbf {KeyGen}(1^{n}, f)\rightarrow (SK, PK, EK)\). With a security parameter n and \(f\in \mathcal {F}\), the key generation algorithm produces a secret key SK, a public key PK, and an evaluation key EK; EK is sent to the server. This is the off-line pre-computation on f.

  • \(\mathbf {ProbGen}(PK, SK, x)\rightarrow (\sigma _{x}, VK_{x})\). With an input x in the domain of f, the problem generation algorithm allows the client to produce an input encoding \(\sigma _{x}\) and a public verification key \(VK_{x}\). This is the on-line pre-computation on specific input x.

  • \(\mathbf {Compute}(PK, EK, f, \sigma _{x})\rightarrow \sigma _{y}\). With PK, EK, f and \(\sigma _{x}\), this algorithm allows the server to evaluate f and return \(\sigma _{y}\) to the verifier.

  • \(\mathbf {Verify}(PK, VK_{x}, \sigma _{y})\rightarrow y/\bot \). With PK, \(VK_{x}\), and \(\sigma _{y}\), this algorithm allows any party to verify the result and return a value y or an error \(\bot \).
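The four algorithms above can be sketched as an abstract interface. The class and function names below are our own illustration of the model, not part of it; `None` stands in for the error \(\bot \).

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Tuple

class PubVC(ABC):
    """The four algorithms of a publicly verifiable computation protocol."""

    @abstractmethod
    def keygen(self, n: int, f) -> Tuple[Any, Any, Any]:
        """Client, off-line: return (SK, PK, EK); EK goes to the server."""

    @abstractmethod
    def probgen(self, PK, SK, x) -> Tuple[Any, Any]:
        """Client, on-line: return (sigma_x, VK_x); VK_x is made public."""

    @abstractmethod
    def compute(self, PK, EK, f, sigma_x) -> Any:
        """Server: return the encoded result sigma_y."""

    @abstractmethod
    def verify(self, PK, VK_x, sigma_y) -> Optional[Any]:
        """Any party: return y, or None standing in for the error."""

def outsource(vc: PubVC, n: int, f, x):
    """One full run: keygen once per f, probgen once per input x."""
    SK, PK, EK = vc.keygen(n, f)
    sigma_x, VK_x = vc.probgen(PK, SK, x)
    sigma_y = vc.compute(PK, EK, f, sigma_x)
    return vc.verify(PK, VK_x, sigma_y)
```

A concrete protocol subclasses `PubVC`; the `outsource` driver makes the division of labor between client, server, and verifier explicit.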

A verifiable computation protocol is secure if it satisfies the following properties: correctness and soundness. Informally, correctness means that the output of an honest server always passes verification.

Definition 6

(Correctness). For any \(f\in \mathcal {F}\), any \((SK, PK, EK)\leftarrow \mathbf {KeyGen}(1^{n}, f)\), any \(x\in Dom(f)\), if \((\sigma _{x}, VK_{x})\leftarrow \mathbf {ProbGen}(PK, SK, x)\) and \(\sigma _{y}\leftarrow \mathbf {Compute}(PK, EK, f, \sigma _{x})\), then the output of \(\mathbf {Verify}(PK, VK_{x}, \sigma _{y})\) is f(x) with all but negligible probability.

Soundness means that no PPT adversary \(\mathcal {A}\) can persuade a verifier to accept an incorrect computation result. Define the following experiment:

\(\mathbf {Exp}^{\mathrm {PubVer}}_{\mathcal {A}}[\mathcal {VC}, f, l, n]:\)

\((SK, PK, EK)\leftarrow \mathbf {KeyGen}(1^{n},f),\)

For \(i=1\) to l:

   \(x_{i}\leftarrow \mathcal {A}(PK, EK, \sigma _{x, 1}, VK_{x, 1}, \ldots , \sigma _{x, i-1}, VK_{x, i-1})\),

   \((\sigma _{x, i}, VK_{x, i})\leftarrow \mathbf {ProbGen}(PK,SK,x_{i})\);

\(x^{*}\leftarrow \mathcal {A}(PK, EK,\sigma _{x,1},VK_{x,1},\ldots ,\sigma _{x,l},VK_{x,l})\),

\((\sigma _{x^{*}},VK_{x^{*}})\leftarrow \mathbf {ProbGen}(PK,SK,x^{*})\),

\(\widehat{\sigma }_{y}\leftarrow \mathcal {A}(PK,EK,\sigma _{x,1},VK_{x,1},\ldots ,\sigma _{x,l},VK_{x,l},\sigma _{x^{*}}, VK_{x^{*}})\),

\(\widehat{y}\leftarrow \mathbf {Verify}(PK,VK_{x^{*}},\widehat{\sigma }_{y})\),

If \(\widehat{y}\ne \perp \) and \(\widehat{y}\ne f(x^{*})\), output 1, else output 0.

For any \(n\in \mathbb {N}\), any function \(f\in \mathcal {F}\), the advantage of an adversary \(\mathcal {A}\) making at most \(l=poly(n)\) queries in the above experiment against \(\mathcal {VC}\) is

$$\begin{aligned} \mathbf {Adv}^{\mathrm {PubVer}}_{\mathcal {A}}(\mathcal {VC},f,l,n)=Pr[\mathbf {Exp}^{\mathrm {PubVer}}_ {\mathcal {A}}[\mathcal {VC},f,l,n]=1] \end{aligned}$$

Definition 7

(Soundness). A verifiable computation protocol \(\mathcal {VC}\) is sound for \(\mathcal {F}\), if for any \(f\in \mathcal {F}\) and any PPT adversary \(\mathcal {A}\) there exists a negligible function \(neg (\cdot )\) such that for all n, \(\mathbf {Adv}^{\mathrm {PubVer}}_{\mathcal {A}}(\mathcal {VC},f,l,n)\le neg(n)\).

3 Multi-labeled Program

The idea of our work is inspired by the multi-labeled verifiable computation protocol of Backes et al. [2]. Briefly describing the concept of multi-labeled programs and the corresponding verifiable computation protocol will help readers appreciate our work more easily.

In a multi-labeled program, a pair of labels \(L=(\varDelta , \tau )\) identifies an input message, where \(\varDelta \) is a data set identifier and \(\tau \) is an input identifier. For instance, suppose we want to record the weather conditions per hour in a day; then we track temperature, humidity, sunlight and so on hourly. \(\tau =(\tau _{1},\tau _{2},\cdots )\) labels temperature, humidity, sunlight, etc., while \(\varDelta \) labels the time. Regard the recordings in each hour as one data set. Different \(\varDelta _{i}\) label different data sets, so \(\tau \) can be reused to label inputs in different data sets. A pair \(L=(\varDelta , \tau )\) uniquely identifies an input, while a single \(\varDelta \) or \(\tau \) cannot. Please refer to [2] for details.
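The labeling scheme amounts to addressing each input by the pair \((\varDelta , \tau )\); a few lines make the reuse of \(\tau \) across data sets concrete (the label strings below are illustrative):

```python
# records maps one label pair (delta, tau) to one input value
records = {}

def store(delta: str, tau: str, value: float) -> None:
    """The pair (delta, tau) uniquely addresses an input."""
    records[(delta, tau)] = value

# one data set per hour; the same tau is reused under different delta
store("09:00", "temperature", 21.5)
store("09:00", "humidity", 0.40)
store("10:00", "temperature", 23.1)
```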

The authors proposed a verifiable computation protocol on quadratic polynomials of m variables using multi-labels. The verification cost is that of evaluating a quadratic polynomial in two variables, so the protocol is efficient when m is large. We briefly review their protocol in the following:

Assume that the outsourced function f is a quadratic polynomial in m variables. For every input \(x_{i}\), \(i=1,\ldots ,m\), the client generates pseudorandom values according to the labels: \((u_{i},v_{i})\leftarrow F_{K_{1}}(\tau _{i})\) and \((a,b)\leftarrow F_{K_{2}}(\varDelta )\), where F is a PRF with secret keys \(K_{1},K_{2}\). The client chooses \(\alpha \mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\) as its secret key, sets \(y_{0}^{(i)}=x_{i},Y_{1}^{(i)}=(g^{u_{i}a+v_{i}b-x_{i}})^{\frac{1}{\alpha }},Y_{2}^{(i)}=1\in \mathbb {G}_{1}\) for \(i=1,\ldots ,m\), and sends the m tuples \((y_{0},Y_{1},Y_{2})\) to the server. The server computes \(\sigma _{y}\) over the arithmetic circuit of f gate by gate:

  • \(\mathbf {Addition}\). If the gate is an addition gate, assume values on two input wires are respectively \(y_{0}^{(1)}\) and \(y_{0}^{(2)}\). Compute \((y_{0},Y_{1},Y_{2})\) as follows:

    $$\begin{aligned} y_{0}=y_{0}^{(1)}+y_{0}^{(2)}, Y_{1}=Y_{1}^{(1)}\cdot Y_{1}^{(2)}, \end{aligned}$$
    $$\begin{aligned} Y_{2}=Y_{2}^{(1)}\cdot Y_{2}^{{(2)}}. \end{aligned}$$
  • \(\mathbf {Multiplication.}\) If the gate is a multiplication gate, assume values on two input wires are respectively \(y_{0}^{(1)}\) and \(y_{0}^{(2)}\). Compute \((y_{0},Y_{1},Y_{2})\) as follows:

    $$\begin{aligned} y_{0}=y_{0}^{(1)}\cdot y_{0}^{(2)}, Y_{1}=(Y_{1}^{(1)})^{y_{0}^{(2)}}\cdot (Y_{1}^{(2)})^{y_{0}^{(1)}}, \end{aligned}$$
    $$\begin{aligned} Y_{2}=e(Y_{1}^{(1)},Y_{1}^{(2)}). \end{aligned}$$
  • \(\mathbf {Multiplication}\) \(\mathbf {with}\) \(\mathbf {constant.}\) If the gate is a multiplication by a constant c, let the value of the other input wire be \(y_{0}^{(1)}\). Compute \((y_{0},Y_{1},Y_{2})\) as follows:

    $$\begin{aligned} y_{0}=c\cdot y_{0}^{(1)}, Y_{1}=(Y_{1}^{(1)})^{c}, \end{aligned}$$
    $$\begin{aligned} Y_{2}=(Y_{2}^{(1)})^{c}. \end{aligned}$$

After finishing the computation, the server sets \(\sigma _{y}=(y_{0},Y_{1},Y_{2})\) and returns it to the client. The verification equation is:

$$\begin{aligned} {\begin{matrix} W=e(g,g)^{y_{0}}\cdot e(Y_{1},g)^{\alpha }\cdot Y_{2}^{\alpha ^{2}}, \end{matrix}} \end{aligned}$$
(1)

where W is computed by the client in two steps. First, the client performs a pre-computation on the outsourced quadratic polynomial f to obtain a quadratic polynomial in two variables:

$$\begin{aligned} \rho (z_{1},z_{2})=f(\rho _{1}(z_{1},z_{2}),\ldots ,\rho _{m}(z_{1},z_{2})) \end{aligned}$$

where \(\rho _{i}(z_{1},z_{2})=u_{i}z_{1}+v_{i}z_{2}\), \((u_{i},v_{i})\leftarrow F_{K_{1}}(\tau _{i})\). Then, when the client wants to outsource the polynomial computation on specific inputs, it generates \((a,b)\leftarrow F_{K_{2}}(\varDelta )\) according to the data set label \(\varDelta \) and computes \(W=\rho (a,b)\). If Eq. (1) holds, the returned \(\sigma _{y}\) was honestly computed and \(y_{0}\) is the correct result; otherwise, the client outputs \(\bot \). The polynomial f can be outsourced many times on different inputs, and the verification cost is that of evaluating a quadratic polynomial in two variables. The correctness and soundness of this protocol were proved by Backes et al. [2].
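As a sanity check, the gate rules and Eq. (1) can be traced numerically by working with exponents instead of group elements, so that the pairing \(e(Y_{1}^{(1)},Y_{1}^{(2)})\) becomes integer multiplication. This is a toy simulation only: exposing exponents is completely insecure, and the modulus and the concrete polynomial \(f(x_{1},x_{2})=x_{1}x_{2}+3x_{1}\) are our own illustrative choices.

```python
import random
from math import gcd

random.seed(1)
N = (2**31 - 1) * (2**61 - 1)  # toy composite modulus standing in for pq

# client secret key alpha, invertible mod N
while True:
    alpha = random.randrange(1, N)
    if gcd(alpha, N) == 1:
        break
inv_alpha = pow(alpha, -1, N)

# pseudorandom values attached to labels (sampled here; PRF outputs in [2])
u = [random.randrange(N) for _ in range(2)]  # per input identifier tau_i
v = [random.randrange(N) for _ in range(2)]
a, b = random.randrange(N), random.randrange(N)  # per data set identifier

x = [5, 7]  # the actual inputs

def enc_input(i):
    """Input encoding (y0, Y1, Y2); Y1 is the exponent of g^{(...)1/alpha}."""
    return (x[i], (u[i] * a + v[i] * b - x[i]) * inv_alpha % N, 0)

def gate_add(s, t):
    return ((s[0] + t[0]) % N, (s[1] + t[1]) % N, (s[2] + t[2]) % N)

def gate_mul(s, t):  # sound only while both Y2 components are 0 (degree <= 2)
    return ((s[0] * t[0]) % N,
            (s[1] * t[0] + t[1] * s[0]) % N,  # (Y1^(1))^{y0^(2)} * (Y1^(2))^{y0^(1)}
            (s[1] * t[1]) % N)                # e(Y1^(1), Y1^(2)) in exponents

def gate_cmul(c, s):
    return ((c * s[0]) % N, (c * s[1]) % N, (c * s[2]) % N)

# server: evaluate f(x1, x2) = x1*x2 + 3*x1 gate by gate
s1, s2 = enc_input(0), enc_input(1)
y0, Y1, Y2 = gate_add(gate_mul(s1, s2), gate_cmul(3, s1))

# verifier: W = rho(a, b) with rho_i(z1, z2) = u_i*z1 + v_i*z2, then Eq. (1)
r = [(u[i] * a + v[i] * b) % N for i in range(2)]
W = (r[0] * r[1] + 3 * r[0]) % N
assert (y0 + alpha * Y1 + alpha * alpha * Y2) % N == W  # Eq. (1) in exponents
assert y0 == 5 * 7 + 3 * 5                              # the claimed f(5, 7)
```

The invariant maintained gate by gate is \(y_{0}+\alpha Y_{1}+\alpha ^{2}Y_{2}\equiv \) the partial evaluation of f at the points \(u_{i}a+v_{i}b\), which is exactly what Eq. (1) checks against \(W=\rho (a,b)\).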

This verifiable computation protocol can handle polynomials of degree at most 2, as it is set in a bilinear map. If we simply lifted it to high-degree polynomials using a multilinear map, the verification cost would be that of a two-variable polynomial of the same high degree. Moreover, the decision linear assumption, to which the protocol reduces the randomness of its PRF, no longer holds in a multilinear map setting. We construct a variant of the PRF that performs better at reducing the on-line pre-computation cost while enabling public verification.

4 Our Protocol

In this section, we present a publicly verifiable computation protocol on large polynomials. Assume that the outsourced polynomial f has m variables and degree at most d. We follow the idea of multi-labeled programs and use a pair of labels \(L=(\varDelta ,\tau _{i})\) to identify input \(x_{i}\), for \(i=1,\ldots ,m\). In the following, we introduce our PRF first and then give the detailed verifiable computation protocol built on it.

4.1 PRF with Amortized Closed-Form Efficiency

The randomness of our PRF is based on the subgroup decisional assumption.

PRF:

  • \(\mathrm {KeyGen}(1^{n}\)): Let \(\varGamma _{k}=(N,\mathbb {G}_{1},\ldots ,\mathbb {G}_{k},g_{1},\ldots ,g_{k},e)\leftarrow \mathcal {G}(1^{n},k)\). Choose two secret keys \(k_{1}\), \(k_{2}\) for PRFs \(F'_{k_{1}}, F'_{k_{2}}:\{0,1\}^{n}\rightarrow \mathbb {Z}_{N}\). Output \(K=\{p,q,k_{1},k_{2}\}\) and the public parameter \(param=\varGamma _{k}\).

  • \(F_{K}(x)\): On input x, generate a pair of values (a, b) according to its label \(L=(\varDelta ,\tau )\): \(a\leftarrow F'_{k_{1}}(\tau )\) and \(b\leftarrow F'_{k_{2}}(\varDelta )\), where \(\varDelta \in \{0,1\}^{n}\) and \(\tau \in \{0,1\}^{n}\). Output \(F_{K}(x)=g_{1}^{pab}\).
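A minimal sketch of this construction, under two assumptions of ours: \(F'\) is instantiated with HMAC-SHA256 (the paper only requires some PRF into \(\mathbb {Z}_{N}\)), and the output \(g_{1}^{pab}\) is represented by its exponent mod N rather than by a real group element, since no practical multilinear group is available.

```python
import hashlib
import hmac

p, q = 2**31 - 1, 2**61 - 1  # toy primes; the group order is N = pq
N = p * q

def F_prime(key: bytes, label: bytes) -> int:
    """F'_k : {0,1}^n -> Z_N, instantiated here with HMAC-SHA256."""
    digest = hmac.new(key, label, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % N

def F(K, delta: bytes, tau: bytes) -> int:
    """F_K(x) = g_1^{pab}; we return the exponent p*a*b mod N."""
    k1, k2 = K
    a = F_prime(k1, tau)    # depends only on the input identifier tau
    b = F_prime(k2, delta)  # depends only on the data set identifier Delta
    return (p * a * b) % N

K = (b"key-one", b"key-two")
r = F(K, b"09:00", b"temperature")
# r is deterministic in (delta, tau) and, as an integer in [0, N), is a
# multiple of p, i.e. the exponent of an element of the order-q subgroup
```

Note how a is fixed per input identifier and b per data set identifier, which is what the amortized pre-computation below exploits.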

Theorem 1

If \(F'\) is a pseudorandom function and the SDA assumption holds for \(\varGamma _{k}\), then \(\mathrm {PRF}\) is a pseudorandom function.

Proof

The proof follows by a standard hybrid argument.

  • Game 0: this is the real game described above for PRF.

  • Game 1: this is Game 0 except that \(F_{k_{1}}'(\tau )\) is replaced by a random function \(\varPhi _{1}:\{0,1\}^{n}\rightarrow \mathbb {Z}_{N}\). It is easy to argue that Game 1 is indistinguishable from Game 0.

  • Game 2: this is Game 1 except that \(F_{k_{2}}'(\varDelta )\) is replaced by a random function \(\varPhi _{2}:\{0,1\}^{n}\rightarrow \mathbb {Z}_{N}\). Similarly to the previous case, one can easily argue that Game 2 is indistinguishable from Game 1.

  • \(\mathbf {Game (3,j)\mathbf{: }}\) let \(Q_{\varDelta }\) be an upper bound on the number of distinct \(\varDelta \) queried by the adversary \(\mathcal {A}\). If \(S=\{\varDelta _{1},\ldots ,\varDelta _{Q_{\varDelta }}\}\) is the ordered set of \(\varDelta \) queried by \(\mathcal {A}\), then, for \(0\le j\le Q_{\varDelta }\), we define the partial sets \(S_{\le j}=\{\varDelta _{i}\in S:i\le j\}\) and \(S_{>j}=\{\varDelta _{i}\in S:i>j\}\). Game (3, j) is the same as Game 2 except that queries \((\varDelta ,\tau )\) with \(\varDelta \in S_{\le j}\) are answered with a random value R chosen uniformly in \(\mathbb {G}_{1}\), whereas queries \((\varDelta ,\tau )\) with \(\varDelta \in S_{>j}\) are answered with \(R=g_{1}^{pab}\), where \(a\leftarrow \varPhi _{1}(\tau )\) and \(b\leftarrow \varPhi _{2}(\varDelta )\).

As one can see, Game (3, 0) is the same as Game 2, while in Game \((3,Q_{\varDelta })\) all queries are answered with fresh random values in \(\mathbb {G}_{1}\), as if \(\mathcal {A}\) had access to a truly random oracle from \(\mathcal {X}\) to \(\mathbb {G}_{1}\). If, for every \(1\le j\le Q_{\varDelta }\), Game \((3,j-1)\) is computationally indistinguishable from Game (3, j) under the subgroup decisional assumption for \(\varGamma _{k}\), the proof is done. So we prove the following lemma:

Lemma 1

If the subgroup decisional assumption holds for \(\varGamma _{k}\), then \(|Pr[G_{3,j-1}]-Pr[G_{3,j}]|\) is negligible for \(1\le j\le Q_{\varDelta }\).

The key tool of our proof is the following lemma, which shows that the function \(f_{b}(U)=U^{pb}\) is a weak PRF under the subgroup decisional assumption.

Lemma 2

If the subgroup decisional assumption holds for \(\varGamma _{k}\), then the function \(f_{b}(U)=U^{pb}\), where \(b\mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\), is a weak PRF.

Proof

For a tuple \((g_{1},g_{1}^{a},g_{1}^{pab})\), rename \(g_{1}^{a}\) as U and \(g_{1}^{pab}\) as V. Given such a pair (U, V), the challenger can create polynomially many pairs \((U_{i},V_{i})\) of the same form, where all \(V_{i}\) are random values in the subgroup \(\mathbb {G}_{1}^{q}\). If there exists a PPT adversary who can distinguish \(f_{b}(U_{i})\) from a random function, whose output is a random value in \(\mathbb {G}_{1}\), with non-negligible probability, then the challenger can solve the subgroup decisional problem with the same probability.

Proof

(Lemma 1). We now show that any PPT adversary \(\mathcal {A}\) that distinguishes Game \((3,j-1)\) from Game (3, j) with non-negligible probability yields a PPT challenger \(\mathcal {C}\) that distinguishes the weak PRF \(f_{b}(U)=U^{pb}\) from a random function with the same probability.

\(\mathcal {C}\) receives as input \(param=\varGamma _{k}\) and gets access to an oracle that outputs a pair (U, V) on each query. Recall that if \(\mathcal {O}=\mathcal {O}_{f}\), then \(V=U^{pb}\), where b is the secret key of the weak PRF f; otherwise, if \(\mathcal {O}=\mathcal {O}_{R}\), then V is chosen at random in \(\mathbb {G}_{1}\). In both cases, U is freshly random at every query.

\(\mathcal {C}\) runs the simulation for \(\mathcal {A}\) as follows.

Assume that \(Q_{\tau }\) is an upper bound on the number of distinct \(\tau \) queried by \(\mathcal {A}\). Let \((\varDelta ,\tau )\) be a query from \(\mathcal {A}\), and write \((\varDelta ,\tau )=(\varDelta _{k},\tau _{i})\) with \(1\le k\le Q_{\varDelta }\) and \(1\le i\le Q_{\tau }\). \(\mathcal {C}\) answers \((\varDelta _{k},\tau _{i})\) as follows.

  • If \(k\le j-1\), then \(\mathcal {C}\) chooses \(R\mathop {\longleftarrow }\limits ^{U}\mathbb {G}_{1}\) uniformly and returns R.

  • If \(k> j\), then \(\mathcal {C}\) chooses \(b_{k}\mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\), queries its oracle to obtain \(U_{i}\), and returns \(R=f_{b_{k}}(U_{i})=U_{i}^{pb_{k}}\).

  • If \(k=j\), then \(\mathcal {C}\) returns \(R=V_{i}\).

Basically, the simulator implicitly sets \(b_{j}=b\), where b is the secret key of the weak PRF f. Let \(G_{3,j}\) be the event that Game (3, j), run with adversary \(\mathcal {A}\), outputs 1. Finally, \(\mathcal {C}\) outputs the same bit as \(\mathcal {A}\).

When \(\mathcal {C}\) has access to the weak PRF, so that \(V_{i}=f_{b}(U_{i})\), \(\mathcal {C}\) simulates Game \((3,j-1)\). On the other hand, when \(\mathcal {C}\) has access to a random function, so that \(V_{i}\) is random and independent of \(U_{i}\), \(\mathcal {C}\) simulates the view of Game (3, j). That is, \(Pr[\mathcal {C}^{\mathcal {O}_{f}}=1]=Pr[G_{3,j-1}]\) and \(Pr[\mathcal {C}^{\mathcal {O}_{R}}=1]=Pr[G_{3,j}]\). We have:

$$\begin{aligned} |Pr[\mathcal {C}^{\mathcal {O}_{f}}=1]-Pr[\mathcal {C}^{\mathcal {O}_{R}}=1]|=|Pr[G_{3,j-1}]-Pr[G_{3,j}]| \end{aligned}$$

The simulation is perfect, so Lemma 1 is proved.

The PRF helps amortize the pre-computation cost. For a specific polynomial f with m variables and degree d, the client performs the pre-computation in two steps. In Step 1, the client transforms the polynomial into a one-variable, degree-d polynomial \(\rho \) at cost \(O((m+1)^{d})\), the same as the cost of evaluating f. In Step 2, the client evaluates \(\rho \) at cost O(d). Details follow:

Step 1.

This is the off-line pre-computation. Generate \(a_{i}\leftarrow F'_{k_{1}}(\tau _{i})\) according to the input identifier \(\tau _{i}\) for \(i=1,\ldots ,m\), where \(F'_{k_{1}}(\cdot )\) is the pseudorandom function producing the exponent a. Set \(\rho _{i}(z)=pa_{i}\cdot z\) for \(i=1,\ldots ,m\); each \(\rho _{i}(z)\) is a degree-1 polynomial in z with no constant term. Evaluate f on \(\rho _{1}(z),\ldots ,\rho _{m}(z)\) to obtain a new one-variable, degree-d polynomial \(\rho (z)\):

$$\begin{aligned} \rho (z)=f(\rho _{1}(z),\ldots ,\rho _{m}(z)). \end{aligned}$$

It is worth noting that the above computation can be done off-line by the client, as it depends only on the function. The input identifiers can be reused many times for a specific polynomial f as long as the data set identifier differs. The cost of this step is \(O((m+1)^{d})\).

Step 2.

This is the on-line pre-computation. Generate \(b\leftarrow F'_{k_{2}}(\varDelta )\) according to the data set identifier, where \(F'_{k_{2}}(\cdot )\) is the pseudorandom function producing the exponent b. Evaluate \(\rho (z)\) at b; the result is \(\rho (b)\) and the cost is O(d).

Once Step 2 is performed, the input of the polynomial f is fixed. Step 2 can be repeated for many different inputs of a specific polynomial f, so the off-line pre-computation cost is amortized whenever f is evaluated many times on different inputs, and the average pre-computation cost is low.
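The two steps can be sketched as follows for a sparse representation of f, where a monomial coeff\(\cdot x_{i_{1}}\cdots x_{i_{j}}\) is the dict entry {(i_1, ..., i_j): coeff}. The PRF outputs \(a_{i}\) and b are taken as given here, and all names and the tiny parameters are illustrative.

```python
from math import prod

def step1_offline(f, a, p, N):
    """Off-line: collapse f(p*a_1*z, ..., p*a_m*z) into rho(z), returned as a
    coefficient list rho[j] for z^j. One pass over the monomials of f."""
    d = max((len(mono) for mono in f), default=0)
    rho = [0] * (d + 1)
    for mono, c in f.items():
        # substituting x_i -> p*a_i*z sends this monomial to degree len(mono)
        rho[len(mono)] = (rho[len(mono)] + c * prod(p * a[i] % N for i in mono)) % N
    return rho

def step2_online(rho, b, N):
    """On-line: evaluate rho at b by Horner's rule, O(d) multiplications."""
    acc = 0
    for coeff in reversed(rho):
        acc = (acc * b + coeff) % N
    return acc

# tiny consistency check: rho(b) = f(p*a_1*b, ..., p*a_m*b)
p, N = 7, 7 * 11
a = {1: 2, 2: 3}
b = 5
f = {(1, 1): 4, (1, 2): 1, (): 6}  # f = 4*x1^2 + x1*x2 + 6
rho = step1_offline(f, a, p, N)
x1, x2 = p * a[1] * b % N, p * a[2] * b % N
assert step2_online(rho, b, N) == (4 * x1 * x1 + x1 * x2 + 6) % N
```

The split mirrors the amortization argument: `step1_offline` touches every monomial of f once and is run once per function, while `step2_online` costs O(d) and is run once per data set identifier.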

4.2 Construction

Our verifiable computation protocol on large polynomials utilizes the PRF above. Let f be the outsourced polynomial; assume it has m variables and degree d. The details are as follows:

  • \(\mathbf {KeyGen}(1^{n},k,f)\rightarrow (SK,PK,EK)\). This is the key generation algorithm run by the client. Generate a k-linear map, \(\varGamma _{k}=(N, \mathbb {G}_1, \ldots , \mathbb {G}_k\), \(g_{1},\ldots , g_{k}, e)\leftarrow \mathcal {G}(1^{n},k)\), where \(k=d+2\). Choose \(\alpha \mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\) uniformly. Choose the secret keys of the PRF as described before, \(K=(k_{1},k_{2})\). Run Step 1 to generate a univariate polynomial \(\rho (z)\) of degree d. Set \(ek=(ek_{0},ek_{1},\ldots ,ek_{i},\ldots ,ek_{d})\) where \(ek_{i}=g^{\alpha ^{i}}_{d-i+1}\).

    The secret key is \(SK=(k_{1},k_{2},p,q,\alpha )\) and the public key is \(PK=\varGamma _{k}\). The evaluation key \(EK=ek\) is sent to the server.

  • \(\mathbf {ProbGen}(SK, PK, x)\rightarrow (\sigma _{x},VK_{x})\). This is the problem generation algorithm run by the client. Run Step 2 to get the result \(\rho (b)\) and set the public verification key as \(VK_{x}=g_{d+2}^{\rho (b)}\).

    Run the PRF to get \(R_{i}=g_{1}^{pa_{i}b}\) for each input \(x_{i}\), \(i=1,\ldots ,m\). Set \(\sigma _{i}=(y_{0}^{(i)},Y_{1}^{(i)},Y_{2}^{(i)})\), where \(y_{0}^{(i)}=x_{i}\in \mathbb {Z}_{N}\), \(Y_{1}^{(i)}=(R_{i}\cdot g_{1}^{-x_{i}})^{\frac{1}{\alpha }}\in \mathbb {G}_{1}\), \(Y_{2}^{(i)}=1\in \mathbb {G}_{1}\). Set \(\sigma _{x}=(\sigma _{1},\ldots ,\sigma _{m})\) and send it to the server.

  • \(\mathbf {Compute}(PK,EK, f, \sigma _{x})\rightarrow \sigma _{y}\). Given the evaluation key EK, \(\sigma _{x}\), PK and the outsourced polynomial f, the server computes \(\sigma _{y}\) as follows. For convenience of description, we write \(f(x)=\sum _{i=1}^{s}f_{i}p_{i}(x)\), and write each monomial further as \(f_{i}p_{i}(x)=f_{i}\prod _{j=1}^{d}x_{i_{j}}\), where \(0\le i_{1},\ldots , i_{d}\le m\), \(x_{0}\) denotes the constant 1, and \(x_{1},\ldots ,x_{m}\) denote the m variables. The server first computes a triple \((y_{0i},Y_{1i},Y_{2i})\) for each monomial and then combines the s triples into \(\sigma _{y}=(y_{0},Y_{1},Y_{2})\), as follows:

    Initialize \(y_{0}=0\), \(Y_{1}=1\in \mathbb {G}_{d+1}\), \(Y_{2}=1\in \mathbb {G}_{d+1}\).

    For \(i=1,\ldots ,s\):

    If \(i_{1}=\ldots =i_{d}=0\), then:

    \(y_{0i}=f_{i},Y_{1i}=ek_{0},Y_{2i}=ek_{0}\);

    Else, let \(\overline{j}\) be such that \(i_{\overline{j}}\ge 1\) and \(i_{\overline{j}+1}=\dots =i_{d}=0\):

    \(Y_{2i}=e(Y_{1}^{(i_{1})}, Y_{1}^{(i_{2})},\ldots ,Y_{1}^{(i_{\overline{j}})},ek_{\overline{j}})\),

    $$\begin{aligned}{\begin{matrix} Y_{1i}=&{}e(Y_{1}^{(i_{1})},\ldots ,Y_{1}^{(i_{\overline{j}-1})},ek_{\overline{j}-1})^{y_{0}^{(i_{\overline{j}})}} \cdot e(Y_{1}^{(i_{1})},\ldots ,Y_{1}^{(i_{\overline{j}-2})},Y_{1}^{(i_{\overline{j}})},ek_{\overline{j}-1})^{y_{0}^{(i_{\overline{j}-1})}}\\ &{}\cdots e(Y_{1}^{(i_{1})},Y_{1}^{(i_{3})}\ldots ,Y_{1}^{(i_{\overline{j}})},ek_{\overline{j}-1})^{y_{0}^{(i_{2})}} \cdot e(Y_{1}^{(i_{2})},\ldots ,Y_{1}^{(i_{\overline{j}})},ek_{\overline{j}-1})^{y_{0}^{(i_{1})}}\\ {} &{}\cdot e(Y_{1}^{(i_{1})},\ldots ,Y_{1}^{(i_{\overline{j}-2})},ek_{\overline{j}-2})^{y_{0}^{(i_{\overline{j}-1})}\cdot y_{0}^{(i_{\overline{j}})}}\cdots e(Y_{1}^{(i_{3})},\ldots ,Y_{1}^{(i_{\overline{j}})},ek_{\overline{j}-2})^{y_{0}^{(i_{1})}\cdot y_{0}^{(i_{2})}}\\ {} &{}\cdots \\ {} &{}e(Y_{1}^{(i_{1})},ek_{1})^{y_{0}^{(i_{2})}\cdots y_{0}^{(i_{\overline{j}})}}\cdots e(Y_{1}^{(i_{\overline{j}})},ek_{1})^{y_{0}^{(i_{1})}\cdots y_{0}^{(i_{\overline{j}-1})}}, \end{matrix}} \end{aligned}$$

    \(y_{0i}=y_{0}^{(i_{1})}\cdots y_{0}^{(i_{\overline{j}})}\),

    set \(y_{0i}=f_{i}y_{0i}\), \(Y_{1i}=(Y_{1i})^{f_{i}}\), and \(Y_{2i}=(Y_{2i})^{f_{i}}\);

    set \(y_{0}=y_{0}+y_{0i}\), \(Y_{1}=Y_{1}\cdot Y_{1i}\), and \(Y_{2}=Y_{2}\cdot Y_{2i}\).

    The server sets \(\sigma _{y}=(y_{0},Y_{1},Y_{2})\) and returns \(\sigma _{y}\) to the verifier.

  • \(\mathbf {Verify}(PK,VK_{x},\sigma _{y})\rightarrow y/\bot \). Any third party who wants to verify the result checks the following equation:

    $$\begin{aligned} {\begin{matrix} g_{d+2}^{y_{0}}\cdot e(Y_{1},g_{1})\cdot e(Y_{2},g_{1})=VK_{x} \end{matrix}} \end{aligned}$$
    (2)

    If the equation holds, the verifier outputs \(y_{0}\) as the correct computation result. Otherwise, it outputs the error symbol \(\bot \).
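Multilinear maps of the required degree have no standard software instantiation, but the mechanics of the protocol can be simulated "in the exponent": every group element at every level is replaced by its discrete logarithm modulo N, and every pairing becomes a product of exponents. The sketch below does this for a single degree-2 monomial, splitting \(\prod _{j}(x_{i_{j}}+\alpha u_{i_{j}})\) (where we write \(Y_{1}^{(i)}=g_{1}^{u_{i}}\)) into the three components that \(\mathbf {Compute}\) carries in \(y_{0}\), \(Y_{1}\) and \(Y_{2}\), and then checks Eq. (2) in exponent form. All concrete numbers are hypothetical.

```python
# In-the-exponent simulation (illustrative only): group elements are replaced
# by their exponents mod N, and pairings become products of exponents.
N = 1_000_003                      # toy prime standing in for the group order

def probgen_exponent(x_i, rho_i_b, alpha):
    """Exponent u_i of Y1^(i) = (R_i * g1^{-x_i})^{1/alpha}, R_i = g1^{rho_i(b)}."""
    return (rho_i_b - x_i) * pow(alpha, -1, N) % N

def compute_monomial(xs, us, alpha):
    """Expand prod_j (x_j + alpha*u_j) as a polynomial in alpha and split its
    coefficients into (y0, exponent of Y1, exponent of Y2)."""
    poly = [1]                     # poly[t] = coefficient of alpha^t
    for x, u in zip(xs, us):
        new = [0] * (len(poly) + 1)
        for t, c in enumerate(poly):
            new[t] = (new[t] + c * x) % N       # pick the x_j factor
            new[t + 1] = (new[t + 1] + c * u) % N  # pick the alpha*u_j factor
        poly = new
    y0 = poly[0]                                          # the alpha^0 term
    exp_y1 = sum(poly[t] * pow(alpha, t, N)               # middle alpha powers,
                 for t in range(1, len(xs))) % N          # supplied by the ek's
    exp_y2 = poly[len(xs)] * pow(alpha, len(xs), N) % N   # the top alpha power
    return y0, exp_y1, exp_y2

alpha = 12345
rho_b = [11 * 17 % N, 13 * 17 % N]   # rho_i(b) = p*a_i*b, as in ProbGen
xs = [20, 30]                        # the actual inputs x_1, x_2
us = [probgen_exponent(x, r, alpha) for x, r in zip(xs, rho_b)]
y0, e1, e2 = compute_monomial(xs, us, alpha)

# Eq. (2) in the exponent: y0 + e1 + e2 = rho_{i1}(b) * rho_{i2}(b) mod N.
assert (y0 + e1 + e2) % N == rho_b[0] * rho_b[1] % N
```

The split works because ProbGen guarantees \(x_{i}+\alpha u_{i}=\rho _{i}(b)\), so the whole product \(\prod _{j}(x_{i_{j}}+\alpha u_{i_{j}})\) equals the corresponding monomial of \(\rho (b)\), and the verifier only needs to re-add the three published components.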

First we show the correctness of the protocol briefly. Recall that \(ek_{i}=g_{d-i+1}^{\alpha ^{i}}\). If \(\sigma _{y}\) is honestly computed by the server, then

$$\begin{aligned} g_{d+2}^{y_{0}}\cdot e(Y_{1},g_{1})\cdot e(Y_{2},g_{1})=g_{d+2}^{\rho (b)}. \end{aligned}$$
(3)

Notice that \(VK_{x}=g_{d+2}^{\rho (b)}\), so Eq. (2) holds. An honest result returned by the server is thus verified correctly.
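The monomial-by-monomial mechanics behind Eq. (3) can be illustrated for a single degree-2 monomial (the indices below are hypothetical). Writing \(Y_{1}^{(i)}=g_{1}^{u_{i}}\), ProbGen guarantees \(x_{i}+\alpha u_{i}=pa_{i}b=\rho _{i}(b)\), so

$$\begin{aligned} \rho _{i_{1}}(b)\rho _{i_{2}}(b)=(x_{i_{1}}+\alpha u_{i_{1}})(x_{i_{2}}+\alpha u_{i_{2}}) =\underbrace{x_{i_{1}}x_{i_{2}}}_{y_{0}}+\alpha \underbrace{(x_{i_{1}}u_{i_{2}}+x_{i_{2}}u_{i_{1}})}_{\text {exponent of }Y_{1}}+\alpha ^{2}\underbrace{u_{i_{1}}u_{i_{2}}}_{\text {exponent of }Y_{2}}. \end{aligned}$$

The factors \(\alpha \) and \(\alpha ^{2}\) are supplied by the \(ek_{i}\) terms inside the pairings, and summing the \(f_{i}\)-weighted monomials gives the exponent \(\rho (b)\) on both sides.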

Now we show the soundness of our protocol. If the (k, l)-MDHI assumption holds in \(\varGamma _{k}\), no PPT adversary can recover any secret key from the public key PK and the evaluation key EK.

Theorem 2

If the co-CDH assumption holds in \(\varGamma _{k}\), then any PPT adversary \(\mathcal {A}\) making at most \(l=poly(n)\) queries has advantage

$$\begin{aligned} Adv_{\mathcal {A}}^{PubVer}(\mathcal {VC},f,l,n)\le neg(n), \end{aligned}$$

where \(neg(\cdot )\) is a negligible function.

Proof

The proof follows by a standard hybrid argument based on the following games:

  • Game 0: this is the real game, the same as \(\mathbf {Exp}_{\mathcal {A}}^{PubVer}(\mathcal {VC},f,l,n)\).

  • Game 1: this is Game 0 except for the following change in the evaluation of \(\rho (b)\). For any x asked by the adversary during the game, instead of computing \(\rho (b)\) using Step 1 and Step 2, which is efficient in the amortized sense, an inefficient one-step evaluation \(\rho (b)=f(\rho _{1}(b),\ldots ,\rho _{m}(b))\) is used. Since both procedures compute the same value, Game 1 is indistinguishable from Game 0.

  • Game 2: this is Game 1 except that the PRF is replaced by a truly random function \(R:\{0,1\}^{n}\times \{0,1\}^{n}\rightarrow \mathbb {G}_{1}\), which generates a set of m random values. By the pseudorandomness of our PRF, Game 2 is indistinguishable from Game 1.

Now we show that if there exists a PPT adversary \(\mathcal {A}\) who can win in Game 2 with non-negligible probability, then there is a challenger \(\mathcal {C}\) who can solve the co-CDH problem with the same probability.

\(\mathcal {C}\) takes as input a group description \(\varGamma _{k}\) and chooses \(r\mathop {\longleftarrow }\limits ^{U}\mathbb {Z}_{N}\). For a query \(x=(x_{1},\ldots ,x_{m})\) from \(\mathcal {A}\), \(\mathcal {C}\) chooses m random values \(\beta _{1},\ldots ,\beta _{m}\in \mathbb {Z}_{N}\) and sets \(R^{(i)}=g_{1}^{\beta _{i}}\) for \(i=1,\ldots ,m\); all \(R^{(i)}\) are random elements of \(\mathbb {G}_{1}\). Set \(\sigma _{x}=(\sigma _{1},\ldots ,\sigma _{m})\) where \(\sigma _{i}=(y_{0}^{(i)},Y_{1}^{(i)},Y_{2}^{(i)})\), \(y_{0}^{(i)}=x_{i}\), \(Y_{1}^{(i)}=(R^{(i)}\cdot g_{1}^{-x_{i}})^{\frac{1}{r}}\), \(Y_{2}^{(i)}=1\in \mathbb {G}_{1}\). Set \(ek=(ek_{0},ek_{1},\ldots ,ek_{i},\ldots ,ek_{d})\) where \(ek_{i}=g^{r^{i}}_{d-i+1}\). \(\mathcal {C}\) computes \(VK_{x}=g_{d+2}^{f(\beta _{1},\ldots ,\beta _{m})}\) and returns \(VK_{x}\) and \(\sigma _{x}\) to \(\mathcal {A}\). The distributions of \(VK_{x}\) and \(\sigma _{x}\) are exactly the same as in Game 2.

Finally, let \(\sigma _{y}^{*}=(y_{0}^{*},Y_{1}^{*},Y_{2}^{*})\) be the output of \(\mathcal {A}\) at the end of the game, such that for some \(x^{*}\) chosen by \(\mathcal {A}\) it holds that \(\mathbf {Verify}(PK,VK_{x^{*}},\sigma _{y}^{*})=y^{*}\), \(y^{*}\ne \bot \) and \(y^{*}\ne f(x^{*})\). By the verification equation, this means that

$$\begin{aligned} g_{d+2}^{y_{0}^{*}}\cdot e(Y_{1}^{*}\cdot Y_{2}^{*},g_{1})=VK_{x}. \end{aligned}$$
(4)

Let \(\sigma _{y}=(y_{0},Y_{1},Y_{2})\) be the correct output of the computation. Then, by correctness, it also holds that:

$$\begin{aligned} g_{d+2}^{y_{0}}\cdot e(Y_{1}\cdot Y_{2},g_{1})=VK_{x}. \end{aligned}$$
(5)

Dividing the verification Eq. (4) by Eq. (5) gives

$$\begin{aligned} g_{d+2}^{y_{0}^{*}-y_{0}}= e(Y_{1}/Y_{1}^{*}\cdot Y_{2}/Y_{2}^{*},g_{1}). \end{aligned}$$
(6)

That is, for a false \(y_{0}^{*}\), \(\mathcal {A}\) can find a \(Y_{1}^{*}\) and a \(Y_{2}^{*}\) satisfying Eq. (6) with non-negligible probability, and then \(\mathcal {C}\) solves the co-CDH problem with the same probability.

5 Conclusion

In this paper, we propose a delegated computation protocol on high-degree polynomials with a large number of variables which allows public verification. Assume that the delegated polynomial has m variables and degree at most d. The off-line pre-computation cost is \(O((m+1)^{d})\), the same as the cost of evaluating the outsourced polynomial. The on-line pre-computation cost is O(d) plus one multilinear map operation. Using the notion of amortization, the off-line pre-computation cost can be amortized if the client delegates the same function f several times on different inputs. The protocol is therefore efficient on average.