1 Introduction

A zero-knowledge proof is a protocol in which a prover convinces a verifier that a statement is true, while conveying no other information apart from its truth. Zero-knowledge proofs have been among the most useful and studied primitives in cryptography since their advent in the 80s. Their popularity has increased even more in recent times, propelled by new applications motivated by blockchain technologies. This context has highlighted the relevance of a particular flavour of zero-knowledge proof, known as zero-knowledge succinct non-interactive argument of knowledge, or zk-SNARK.

The flexibility and efficiency of zk-SNARKs make it possible to provide practical arguments of knowledge for relations that lack any kind of algebraic structure, for instance the preimage relation of a one-way function. However, it is well known [Wee05] that under standard complexity assumptions, succinct non-interactive arguments do not exist unless some kind of setup is assumed, such as a common reference string. If the setup relies on secret randomness, this requires either a trusted third party or the execution of heavy MPC protocols.

For this reason, transparent SNARKs, whose setup involves only publicly generated randomness, have been proposed. Many such constructions have appeared in recent years, based both on asymmetric [BCC+16, WTS+18, BBB+18, BFS20] and on symmetric [AHIV17, BBHR18b, BCR+19, COS20, Set20, BFH+20] cryptographic techniques.

In this work we focus on this latter type of construction and remark that all cited works in this category are built in (variants of) the Interactive Oracle Proof framework presented in [BCS16] and independently in [RRR16] as “interactive PCP”. Moreover, they all address, directly or indirectly, the \( \textsf{NP} \)-complete rank-1 constraint system (R1CS) satisfiability problem. An easier-to-state variant asks to prove, given \(A, B, C \in \mathbb {F}^{m,n}\) and \(\textbf{b} \in \mathbb {F}^m\), the existence of a vector \(\textbf{z} \in \mathbb {F}^n\) such that \(A \textbf{z} *B \textbf{z} = C \textbf{z} + \textbf{b}\), where \(*\) is the component-wise multiplication of vectors in \(\mathbb {F}^m\).
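
To make the flavour of this relation concrete, the following minimal Python sketch checks a toy \(\mathbb {F}_2\)-R1CS instance; the single constraint shown (an AND gate) and all names are illustrative and not taken from the paper or its implementation.

```python
# Toy check of R1CS satisfiability over F_2: (A z) * (B z) = C z + b component-wise,
# with all arithmetic modulo 2.  The instance below is illustrative only.

def mat_vec(M, z):
    """Matrix-vector product over F_2."""
    return [sum(M[i][j] * z[j] for j in range(len(z))) % 2 for i in range(len(M))]

def r1cs_satisfied(A, B, C, b, z):
    Az, Bz, Cz = mat_vec(A, z), mat_vec(B, z), mat_vec(C, z)
    return all((Az[i] * Bz[i]) % 2 == (Cz[i] + b[i]) % 2 for i in range(len(b)))

# A single constraint encoding z1 AND z2 = z3 (so m = 1, n = 3):
A, B, C, b = [[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]], [0]
print(r1cs_satisfied(A, B, C, b, [1, 1, 1]))  # True:  1 AND 1 = 1
print(r1cs_satisfied(A, B, C, b, [1, 1, 0]))  # False: 1 AND 1 != 0
```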

An IOP is an interactive proof in which the verifier has oracle access to some strings provided by the prover. Its relation to zk-SNARKs stems from the results in [BCS16], where it was shown that any IOP can be efficiently compiled into a non-interactive argument in the random oracle model by using Merkle trees [Mer90]. Moreover, the transformation, which can be seen as a generalization of the reduction in [Mic94] from PCPs, preserves zero knowledge and knowledge soundness. In particular, IOPs can be used to construct zk-SNARKs.

Unfortunately, the IOP constructions above cannot be directly instantiated for every field choice, as they extensively use Reed-Solomon codes, which require the existence of enough points in \(\mathbb {F}\); even worse, the soundness error is often greater than \(|\mathbb {F}|^{-1}\) due to polynomial identity tests, which forces \(|\mathbb {F}| > 2^\lambda \) for security parameter \(\lambda \). This leaves out, for example, the case of R1CS over \(\mathbb {F}_2\). This case is actually interesting, as some hash functions and encryption schemes can be interpreted as boolean circuits with relative ease and then translated to an R1CS. A straightforward way to overcome this problem, mentioned in [AHIV17], is to simply embed \(\mathbb {F}_2\) in a larger field \(\mathbb {F}_{2^e}\), for large enough e (at least \(e>\lambda \)), add constraints of the kind \(z_i^2 = z_i\) for \(i = 1, \ldots , n\) to ensure that the witness entries belong to \(\mathbb {F}_2\), and then execute the protocol for R1CS over the larger field.

However, this approach seems wasteful: elements of \(\mathbb {F}_{2^e}\), which could in principle encode up to e bits of information, are used to represent a single element of \(\mathbb {F}_2\). Also, operations over \(\mathbb {F}_{2^e}\) are more expensive than those over \(\mathbb {F}_2\). Finally, one needs the aforementioned additional constraints on the witness, which increase the size of the system.

Since \(\mathbb {F}_{2^e}\) is an e-dimensional vector space over \(\mathbb {F}_2\), one attempt to improve this would be to interpret vectors in \(\mathbb {F}_2^e\) as elements of the larger field \(\mathbb {F}_{2^e}\). While this would work for systems that only involve additions (XORs), it fails in general when multiplications (ANDs) are needed too. The technical issue is that for \(e > 1\), the ring \(\mathbb {F}_2^e\), considered with component-wise addition and multiplication, cannot be embedded via a ring homomorphism into \(\mathbb {F}_{2^e}\) (nor into any other finite field), since \(\mathbb {F}_2^e\) contains zero divisors while fields do not.

A better approach was presented in BooLigero [GSV21] for the case of Ligero [AHIV17]. Their technique allows one to encode e bits into roughly \(\sqrt{e}\) field elements of \(\mathbb {F}_{2^e}\), so that one can use Ligero over \(\mathbb {F}_{2^e}\) to treat \(\sqrt{e}\) times larger statements over \(\mathbb {F}_2\) than the “naïve” method, with roughly the same R1CS size. This motivates the following question: can we find embeddings of \(\mathbb {F}_2^k\) into \(\mathbb {F}_{2^e}\) with a larger embedding rate k/e which allow us to produce more efficient IOPs for R1CS over \(\mathbb {F}_2\) given an IOP for R1CS over \(\mathbb {F}_{2^e}\)?

1.1 Our Contributions

In this work we answer the above question in the affirmative using a more efficient embedding that allows us to encode \(k \ge e/4\) bits into an element of \(\mathbb {F}_{2^e}\). We then present a construction of an IOP for \(\mathbb {F}_2\)-R1CS satisfiability which makes black-box use of any IOP for R1CS over larger fields satisfying mild assumptions. This reduces Aurora's argument size by a factor between \(1.31\times \) and \(1.65 \times \) and Ligero's argument size by up to \(3.71 \times \).

More concretely, we can use any Reed-Solomon encoded IOP, a variant of IOPs introduced in [BCR+19], that provides two commonly used sub-protocols: a generalised lincheck, which tests linear relations of the form \(A_1 \textbf{x}_1 + \ldots + A_n \textbf{x}_n = \textbf{b}\) when the verifier has only oracle access to Reed-Solomon codewords encoding \(\textbf{x}_i\), and a rowcheck, which tests quadratic relations \(\textbf{x} *\textbf{y} = \textbf{z}\) when the verifier has oracle access to encodings of \(\textbf{x}, \textbf{y}, \textbf{z}\). This includes Ligero, Aurora [BCR+19] and Ligero++ [BFH+20], up to minor manipulations to transform their lincheck; see the full version [CG21].

In a nutshell, our embedding technique relies primarily on two components. First, the use of reverse multiplication friendly embeddings (RMFE), introduced in the MPC literature in [CCXY18] and independently in [BMN18] and used in several subsequent works [DLN19, CG20, PS21, DGOT21, ACE+21]. Such an algebraic device maps a vector from \(\mathbb {F}_2^k\) to an element of a larger field \(\mathbb {F}_q = \mathbb {F}_{2^e}\) in such a way that field additions and products of two encodings in \(\mathbb {F}_{q}\) still encode the component-wise additions and products of the original vectors from \(\mathbb {F}_2^k\), even though the map is not a ring homomorphism. For \(k < 100\) we can get RMFEs with \(e \approx 3.3k\) (or \(e=4k\) if we insist on e being a power of 2). Second, the notion of modular lincheck, an IOPP which we introduce in Sect. 3.3 and which we believe to be of independent interest, to test linear relations modulo an \(\mathbb {F}_2\)-vector space V contained in \(\mathbb {F}_q\), i.e. equations of the form \(A \textbf{x} = \textbf{b} \mod V^n\) (meaning that each coordinate of the vector \(A \textbf{x} - \textbf{b}\) is in V).

In conclusion, for each of the aforementioned schemes we compare known adaptations to \(\mathbb {F}_2\)-R1CS with our general reduction, both in terms of argument size and prover complexity. We estimate the proof size numerically (see our Python implementation at [Git21]) and the prover time asymptotically, predicting an improvement factor of \(24.7 \times \) for Aurora and between \(6.9\times \) and \(32.5 \times \) for Ligero without interactive repetitions.

1.2 Techniques

Reverse Multiplication Friendly Embeddings. A \((k,e)_p\)-RMFE is a pair of \(\mathbb {F}_p\)-linear maps \(\varphi :\mathbb {F}_p^k\rightarrow \mathbb {F}_{p^e}\) and \(\psi :\mathbb {F}_{p^e}\rightarrow \mathbb {F}_p^k\) satisfying \(\textbf{x}*\textbf{y}=\psi (\varphi (\textbf{x})\cdot \varphi (\textbf{y})) \) for all \(\textbf{x},\textbf{y} \in \mathbb {F}_p^{k}\), where \(*\) denotes the component-wise product. These properties automatically imply that \(\varphi \) is injective, hence the name embedding. Note that \(\varphi \) is not necessarily a ring homomorphism, i.e. \(\varphi (\textbf{x}*\textbf{y})\ne \varphi (\textbf{x})\cdot \varphi (\textbf{y})\) in general. In this paper we extend the notation blockwise to \(\varPhi : (\mathbb {F}_p^k)^n \rightarrow \mathbb {F}_{p^e}^n\) given by \(\varPhi (\textbf{x}_1, \ldots , \textbf{x}_n) = (\varphi (\textbf{x}_1), \ldots , \varphi (\textbf{x}_n))\) and \(\varPsi : \mathbb {F}_{p^e}^n \rightarrow (\mathbb {F}_p^k)^n\) given by \(\varPsi ( x_1, \ldots , x_n ) = (\psi (x_1), \ldots , \psi (x_n))\). These maps then satisfy \(\textbf{x} *\textbf{y} = \varPsi (\varPhi (\textbf{x}) *\varPhi (\textbf{y}))\) for all \(\textbf{x}, \textbf{y}\in (\mathbb {F}_p^k)^n = \mathbb {F}_p^{kn}\), where the component-wise product on the right-hand side is taken in \(\mathbb {F}_{p^e}^n\).

From \(\mathbb {F}_2\)-R1CS to a System of Statements Over \(\mathbb {F}_q\). A key ingredient of our result is how to translate the system \(A_1 \textbf{w} *A_2 \textbf{w} = A_3 \textbf{w} + \textbf{b}\) over \(\mathbb {F}_2\) into an equivalent set of relations over \(\mathbb {F}_q\) that can be efficiently checked. Even with the RMFE in hand, this is not trivial, because \(\varphi \) (and consequently \(\varPhi \)) is neither a ring homomorphism nor surjective.

Defining \(\textbf{x}_i = A_i \textbf{w}\), we can split the above statement into the three linchecks \(A_i \textbf{w} = \textbf{x}_i\) and the rowcheck \(\textbf{x}_1 *\textbf{x}_2 = \textbf{x}_3 + \textbf{b}\). The prover will start by embedding \({{}\widetilde{\textbf{w}}} = \varPhi (\textbf{w}) \in \mathbb {F}_q^{n/k}\) and \({{}\widetilde{\textbf{x}}}_i = \varPhi (\textbf{x}_i)\). We then need to deal with the following:

First of all, because \(\varPhi \) is not surjective, we need additional constraints to ensure \({{}\widetilde{\textbf{w}}}, {{}\widetilde{\textbf{x}}}_i\) lie in the image of \(\varPhi \). We can write these in the form \(I_{n/k} \cdot {{}\widetilde{\textbf{w}}} \in ({\textrm{Im}}\,\varphi )^{n/k}\) and \(I_{m/k} \cdot {{}\widetilde{\textbf{x}}}_i \in ({\textrm{Im}}\,\varphi )^{m/k}\) (where \(I_{\ell }\) is the \(\ell \) by \(\ell \) identity matrix).

Then, because \(\varPhi \) is not a ring homomorphism, we cannot simply translate \(\textbf{x}_1 *\textbf{x}_2 = \textbf{x}_3 + \textbf{b}\) into \({{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2 = {{}\widetilde{\textbf{x}}}_3 + \varPhi (\textbf{b})\), as this is not true in general. Instead, we need to use the RMFE “product recovery map” \(\psi \). Setting \(\textbf{t} = {{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2\), we show that the rowcheck statement is equivalent to the modular linear relation \(\textbf{t} - u\cdot {{}\widetilde{\textbf{x}}}_3 = u\cdot \varPhi (\textbf{b}) \mod ({\textrm{Ker}}\,\psi )^{m/k}\), where \(u = \varphi (\textbf{1})\in \mathbb {F}_q\), \(\textbf{1}\) is the all-one vector and \({\textrm{Ker}}\,\) denotes the kernel.

Similarly, we show that each lincheck \(A_i \textbf{w} = \textbf{x}_i\) can be translated into \({{}\widetilde{A_i}} {{}\widetilde{\textbf{w}}} - {{}\widetilde{I_m}} {{}\widetilde{\textbf{x}}}_i\in ({\textrm{Ker}}\,S \circ \psi )^m\), where \({{}\widetilde{A_i}}, {{}\widetilde{I_m}}\) are the result of applying \(\varPhi \) to \(A_i, I_m\) row-wise and S is the map summing the k components of a vector in \(\mathbb {F}_2^k\).

Modular Linear Test. The characterization sketched above implies that providing a way to test modular linear relations over \(\mathbb {F}_q\) yields the desired IOP, as the prover can provide oracle access to encodings of \({{}\widetilde{\textbf{w}}}, {{}\widetilde{\textbf{x}}}_1, {{}\widetilde{\textbf{x}}}_2, {{}\widetilde{\textbf{x}}}_3, \textbf{t}\) and then convince the verifier that all those constraints are satisfied. To test \(\textbf{x} = \textbf{0} \mod V^n\), a standard approach would consist in proving that a random linear combination of its coordinates belongs to V. However, we are dealing with an \(\mathbb {F}_2\)-vector space, and this translates into a soundness error of 1/2. In order to decrease it to \(2^{-\lambda }\), we could check \(\lambda \) independent linear combinations, which involves \(\lambda n\) random bits. In Sect. 3.3 we describe how to reduce the required random bits to \(\varTheta (\lambda )\) by using a certain family of almost universal linear hash functions, and how to achieve zero knowledge by adding a masking term.
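
As a quick sanity check of the 1/2 soundness error of a single \(\mathbb {F}_2\)-linear combination, the short snippet below (an illustration, not part of the paper's implementation) counts exhaustively, for one arbitrarily chosen nonzero vector, how many combination vectors detect it.

```python
# For a nonzero x over F_2, a uniformly random linear combination r^T x is nonzero
# with probability exactly 1/2; here this is checked exhaustively for one x.
import itertools

n = 4
x = [1, 0, 1, 1]  # any nonzero vector in F_2^n (illustrative choice)
hits = sum(
    sum(r_i * x_i for r_i, x_i in zip(r, x)) % 2 != 0
    for r in itertools.product([0, 1], repeat=n)
)
print(hits / 2**n)  # 0.5: a single random combination has soundness error 1/2
```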

Optimizations. The above techniques require a total of 8 modular linchecks and a rowcheck. In Sect. 4, we introduce several modifications, the main one being to reduce the number of modular linchecks to just 3. The observation is that we can test several equations of the form \(A\textbf{x}_i=\textbf{b}_i \mod V^{n_i}\) (with common V) all at once by checking \(\sum R_i (A\mathbf {x_i}-\textbf{b}_i)\in V^\lambda \) for appropriately chosen matrices \(R_i\). Additionally, we compress the messages sent by the prover using the structure of these vector spaces V, which comes from our use of an RMFE.

1.3 Other Related Work

Our work provides a significant reduction of the proof size with respect to BooLigero [GSV21]. Applying our construction to Ligero for an \(\mathbb {F}_2\)-R1CS consisting of \(2^{20}\) constraints, we measure proofs \(3.71 \times \) shorter than plain Ligero's and \(3.03 \times \) shorter than BooLigero's. We also stress that, in contrast to [GSV21], we present a general reduction that can be applied to a larger class of protocols.

Regarding the use of RMFEs, to the best of our knowledge only the recent work [DGOT21] has applied this tool in the IOP framework (see their Appendix A). However, their use of it is restricted to their own protocol, which follows the MPC-in-the-head paradigm introduced in [IKOS07], and cannot be applied directly to other existing IOPs such as Aurora. Furthermore, that optimisation is only considered in the multi-instance case, whereas our work integrates the RMFE also for a single instance.

We also remark that even though our construction captures essentially any IOP that provides a lincheck and a rowcheck, it still cannot be applied out of the box to zk-SNARKs with preprocessing such as Fractal [COS20] or Spartan [Set20]. The reason is that we use the given linchecks to test a randomised relation depending on the random coins of the verifier, which significantly limits the usefulness of any pre-computation. We believe, however, that this issue can be overcome in a non-black-box way with different techniques, a problem that we leave for future work.

2 Preliminaries

The set \(\{1, \ldots , n \}\) is denoted \({[n]}\). Vectors are denoted with boldface font. \(\textbf{v} *\textbf{w}\) denotes the coordinate-wise product of two vectors of the same length, and \(\left\Vert \textbf{v}\right\Vert \) is the Hamming weight of \(\textbf{v}\). \(\textbf{1}_{k}\) is the vector of k 1's. Matrices are denoted with capital letters, \(A^\top \) is the transpose of A and \(I_n\) is the n by n identity matrix. Given a prime power q, \(\mathbb {F}_q\) is a field of q elements. When \(q=p^e\), \(\mathbb {F}_p\) can be seen as a subset of \(\mathbb {F}_q\) and \(\mathbb {F}_q\) can be treated as an \(\mathbb {F}_p\)-vector space of dimension e. \(V \le \mathbb {F}_q\) means that V is an \(\mathbb {F}_p\)-vector subspace of \(\mathbb {F}_q\). \(a = b \mod V\) means that \(a - b \in V\), and for vectors of length m, \(\textbf{a} = \textbf{b} \mod V^m\) iff \(a_i=b_i \mod V\) for all \(i\in {[m]}\). Given an \(\mathbb {F}_p\)-linear map \(L : V \rightarrow W\), its kernel is \({\textrm{Ker}}\,L = \{\textbf{x} \in V : L(\textbf{x}) = 0\}\) and its image is \({\textrm{Im}}\,L = \{\textbf{y} \in W: \textbf{y} = L(\textbf{x}) \text { for some } \textbf{x}\in V\}\). Given a polynomial \({}\widehat{f} \in \mathbb {F}_q[x]\) and \(L \subseteq \mathbb {F}_q\), we denote by \({{}\widehat{f}}_{|L} = ({}\widehat{f}(\alpha ))_{\alpha \in L}\) its evaluation over L. The Reed-Solomon code over L of rate \(\rho \in [0, 1]\) is the set \( \textsf{RS} _{{\mathbb {F}_q}, {L}, {\rho }} {:}{=}\{ {{}\widehat{f}}_{|L} : {}\widehat{f} \in \mathbb {F}_q[x], \; \deg {}\widehat{f} < \rho |L| \}\). We will typically encode a vector \(\textbf{v}\) of length \(m < \rho |L|\), with coordinates indexed by a set \(H \subseteq \mathbb {F}_q\) disjoint from L, as a codeword from \( \textsf{RS} _{{\mathbb {F}_q}, {L}, {\rho }}\) by sampling an \(f \in \textsf{RS} _{{\mathbb {F}_q}, {L}, {\rho }}\) such that \({{}\widehat{f}}_{|H} = \textbf{v}\). \(\mathbb {F}_q^H\) denotes the set of vectors over \(\mathbb {F}_q\) with coordinates indexed by H and \(\mathbb {F}_q^{H_1 \times H_2}\) is the set of matrices with rows and columns indexed by \(H_1\) and \(H_2\) respectively. Finally, \(\textrm{FFT}(\mathbb {F}, n)\) denotes the number of field operations required to perform a fast Fourier transform over a set of size n, see [GM10].
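
As a concrete illustration of this encoding (a minimal sketch only: the toy field, evaluation sets and vector below are hypothetical choices, and the zero-knowledge padding and affine-subspace structure used later are omitted), the following Python snippet builds a Reed-Solomon codeword over \(\mathbb {F}_{2^3}\) whose underlying polynomial interpolates a given vector on H.

```python
# Minimal Reed-Solomon encoding sketch over F_8 = F_2[T]/(T^3 + T + 1):
# encode v by the unique polynomial f of degree < |H| with f|_H = v,
# and output its evaluations over the disjoint domain L.

IRRED = 0b1011  # T^3 + T + 1; field elements are 3-bit integers

def mul(a, b):
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):            # reduce degrees 4 and 3
        if (r >> i) & 1:
            r ^= IRRED << (i - 3)
    return r

def inv(a):
    return next(b for b in range(1, 8) if mul(a, b) == 1)

def lagrange_eval(points, values, x):
    """Evaluate at x the unique polynomial of degree < len(points) through the data."""
    acc = 0
    for i, (xi, vi) in enumerate(zip(points, values)):
        num, den = vi, 1
        for j, xj in enumerate(points):
            if j != i:
                num = mul(num, x ^ xj)   # (x - x_j); subtraction is XOR
                den = mul(den, xi ^ xj)  # (x_i - x_j)
        acc ^= mul(num, inv(den))
    return acc

H = [1, 2]            # index set of the vector to encode, disjoint from L
L = [3, 4, 5, 6, 7]   # evaluation domain of the code
v = [6, 5]            # vector to encode: f|_H = v, deg f < 2 <= rho*|L|
codeword = [lagrange_eval(H, v, a) for a in L]
assert [lagrange_eval(H, v, h) for h in H] == v
print(codeword)       # an element of RS_{F_8, L, rho} for any rho >= 2/5
```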

2.1 Reverse Multiplication Friendly Embedding

We now recall the notion of reverse multiplication friendly embedding from [CCXY18]. Its purpose is to ‘reconcile’ the coordinate-wise multiplicative structure of a ring \(\mathbb {F}_p^k\) and the finite field structure of an extension \(\mathbb {F}_{p^e}\) of \(\mathbb {F}_p\).

Definition 1

Given a prime power p and \(k, e \in \mathbb {N}\), a Reverse Multiplication-Friendly Embedding, denoted \((k,e)_p\)-RMFE, is a pair of \(\mathbb {F}_p\)-linear maps \(\varphi : \mathbb {F}_p^k \rightarrow \mathbb {F}_{p^e}\), \(\psi : \mathbb {F}_{p^e} \rightarrow \mathbb {F}_p^k \) such that for all \(\textbf{x}, \textbf{y} \in \mathbb {F}_p^k\), it holds that

$$ \textbf{x} *\textbf{y} = \psi ( \varphi (\textbf{x}) \cdot \varphi (\textbf{y}) ). $$

That is, one can embed \(\mathbb {F}_p^k\) into \(\mathbb {F}_{p^e}\) via a linear map \(\varphi \) so that the product in \(\mathbb {F}_{p^e}\) of the images of any two vectors \(\textbf{x}, \textbf{y}\) carries information about their component-wise product \(\textbf{x} *\textbf{y}\), which can be recovered by applying \(\psi \) to that field product. However, \(\varphi \) is in general not a ring homomorphism and therefore \(\psi \ne \varphi ^{-1}\). For notational convenience, we extend both \(\varphi \) and \(\psi \) to maps \(\varPhi \), \(\varPsi \) as follows. Given vectors \(\textbf{x} = (\textbf{x}_1, \ldots , \textbf{x}_n) \in (\mathbb {F}_p^k)^n\) and \(\textbf{z}=(z_1,\dots ,z_n) \in (\mathbb {F}_{p^e})^n\) we define

$$ \varPhi (\textbf{x}) \; {:}{=}\; ( \varphi (\textbf{x}_1), \ldots , \varphi (\textbf{x}_n) ) \in (\mathbb {F}_{p^e})^n, \qquad \varPsi (\textbf{z}) \; {:}{=}\; ( \psi (z_1), \ldots , \psi (z_n) ) \in (\mathbb {F}_p^k)^n. $$

The following properties of these extended functions will be key in Sect. 3.1 to transform an \(\mathbb {F}_2\)-R1CS system into a system of equations over \(\mathbb {F}_{2^e}\). Note in particular that (3) and (4) characterize, respectively, coordinate-wise and inner products over \(\mathbb {F}_p\) in terms of the corresponding operations over \(\mathbb {F}_{p^e}\). The lemma follows quite directly from the definitions and a proof appears in the full version [CG21].

Lemma 1

The following holds for all positive \(n \in \mathbb {N}\):

  1. The maps \(\varphi \) and \(\varPhi \) are injective. The maps \(\psi \) and \(\varPsi \) are surjective.

  2. For all \(\textbf{x}\), \(\textbf{y} \in (\mathbb {F}_p^k)^n\), \(\textbf{x} *\textbf{y} = \varPsi ( \varPhi (\textbf{x}) *\varPhi (\textbf{y}) )\) where the \(*\) product in the right-hand side is component-wise in \((\mathbb {F}_{p^e})^n\), i.e. in each component we use the field product in \(\mathbb {F}_{p^e}\).

  3. Let \(u = \varphi (\textbf{1}_k)\in \mathbb {F}_{p^e}\). Then for all \(\textbf{x} \in (\mathbb {F}_p^k)^n\) we have \(\textbf{x} = \varPsi (u \cdot \varPhi (\textbf{x}))\).

  4. Let \(S : \mathbb {F}_p^k \rightarrow \mathbb {F}_p\) be given by \(S(x_1, x_2, \dots , x_k) = x_1 + x_2 + \dots + x_k\). Then for all \(\textbf{x}\), \(\textbf{y} \in (\mathbb {F}_p^k)^n\), the inner product \(\textbf{x}^\top \textbf{y}\) can be written as

     $$ \textbf{x}^\top \textbf{y} = S \circ \psi ( \varPhi (\textbf{x})^\top \varPhi (\textbf{y})) $$

As for the existence of RMFEs, in our case of interest, \(p=2\), one can obtain the following parameters by concatenation of polynomial interpolation techniques [CCXY18, CG20] (for asymptotics and other results see the full version [CG21]):

Lemma 2

For all \(r\le 33\), there exists a \((3r,10r)_2\)-RMFE. For all \(a \le 17\) there exists a \((2a, 8a)_2\)-RMFE. For all \(b \le 65\) there exists a \((3b, 12b)_2\)-RMFE.

Setting \(r = a = b = 16\), this yields RMFEs with parameters (48, 192), (48, 160) and (32, 128), which we will use to concretely evaluate our reduction.
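
To make Definition 1 and Lemma 1 concrete, here is a small self-contained Python sketch of a toy \((2,3)_2\)-RMFE obtained by interpolation at the two points of \(\mathbb {F}_2\) (a hypothetical illustration with far smaller parameters than the RMFEs of Lemma 2). It exhaustively checks the defining property, property (3) of Lemma 1, and the fact that \(\varphi \) is not a ring homomorphism.

```python
# Toy (2,3)_2-RMFE via polynomial interpolation at the points 0 and 1 of F_2.
# F_8 = F_2[T]/(T^3 + T + 1); elements are 3-bit integers (bit i = coeff. of T^i).

IRRED = 0b1011  # T^3 + T + 1

def gf8_mul(a, b):
    """Field multiplication in F_8."""
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):               # reduce degrees 4 and 3
        if (r >> i) & 1:
            r ^= IRRED << (i - 3)
    return r

def phi(x):
    """phi(x1, x2) = f(alpha) for the degree-<2 polynomial f with f(0)=x1, f(1)=x2."""
    x1, x2 = x
    return x1 | ((x1 ^ x2) << 1)   # f(T) = x1 + (x1 + x2) T, evaluated at alpha = T

def psi(z):
    """psi(g(alpha)) = (g(0), g(1)) for the unique g of degree < 3 with g(alpha) = z."""
    g0, g1, g2 = z & 1, (z >> 1) & 1, (z >> 2) & 1
    return (g0, g0 ^ g1 ^ g2)

vectors = [(a, b) for a in (0, 1) for b in (0, 1)]
for x in vectors:
    for y in vectors:
        # Definition 1: x * y (component-wise) = psi(phi(x) . phi(y))
        assert psi(gf8_mul(phi(x), phi(y))) == (x[0] & y[0], x[1] & y[1])

# Lemma 1, property (3): with u = phi(1_k), psi(u . phi(x)) = x for every x.
u = phi((1, 1))
for x in vectors:
    assert psi(gf8_mul(u, phi(x))) == x

# phi is not a ring homomorphism: phi(x * y) != phi(x) . phi(y) in general.
x, y = (1, 0), (0, 1)
assert gf8_mul(phi(x), phi(y)) != phi((x[0] & y[0], x[1] & y[1]))
print("toy (2,3)_2-RMFE verified")
```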

2.2 R1CS, Lincheck and Rowcheck

We now recall the main relations used in recent IOP-based SNARKs like [BCR+19, AHIV17]. The first one is the rank-1 constraint system, or R1CS, which defines an NP-complete language closely related to arithmetic circuit satisfiability. Here we present an equivalent affine version that requires, given \(A_1, A_2, A_3 \in \mathbb {F}^{m, n}\) and \(\textbf{b} \in \mathbb {F}^m\), to exhibit a vector \(\textbf{w} \in \mathbb {F}^n\) such that \(A_1 \textbf{w} *A_2 \textbf{w} = A_3 \textbf{w} + \textbf{b}\). Formally:

Definition 2

We define the affine R1CS relation as the set

$$ \mathcal {R}_{ \textsf{R1CS} }= \{ ((\mathbb {F}, m, n, A_1, A_2, A_3, \textbf{b}), \textbf{w}) \, : \, A_i \in \mathbb {F}^{m,n} , \; A_1 \textbf{w} *A_2 \textbf{w} = A_3 \textbf{w} + \textbf{b}\}. $$

Instead of directly providing a proof system for R1CS, two intermediate relations, lincheck and rowcheck, are defined, for which [BCR+19] constructs RS-encoded IOPPs. These are then used as building blocks to produce an RS-encoded IOP for the R1CS relation, which in turn can be combined with a low degree test, such as FRI [BBHR18a] or [BGKS20], to obtain a standard IOP for R1CS. The complexity of this reduction depends on the so-called max rates, two parameters related to the degrees of the polynomials and the relations being tested.

The lincheck relation requires that the witnesses \(f_1, f_2 \in \textsf{RS} _{{L}, {\rho }}\) encode over \(H_1, H_2 \subseteq \mathbb {F}_q\) two vectors \(\textbf{x}_1, \textbf{x}_2\) (i.e. \({{}\widehat{f}_i}_{|H_i} = \textbf{x}_i\)) which satisfy a given linear constraint \(\textbf{x}_1 = M\textbf{x}_2\). The rowcheck relation requires that witnesses \(f_1, f_2, f_3 \in \textsf{RS} _{{L}, {\rho }}\) encode over \(H \subseteq \mathbb {F}_q\) three vectors \(\textbf{x}_1, \textbf{x}_2, \textbf{x}_3\) such that \(\textbf{x}_1 *\textbf{x}_2 = \textbf{x}_3\). For efficiency reasons, depending on the concrete instantiations of Aurora and FRI, in both definitions below \(L, H_1, H_2, H\) are taken to be \(\mathbb {F}_2\)-affine subspaces of \(\mathbb {F}_q\).

Definition 3

We define \(\mathcal {R}_{ \textsf{Lin} _{}}\) as the set of tuples \(((\mathbb {F}_q, L, H_1, H_2, \rho , M), (f_1, f_2))\) such that \(L, H_i \subseteq \mathbb {F}_q\) are affine subspaces, \(H_i \cap L = \varnothing \) for \(i \in \{1, 2\}\), \(f_i \in \textsf{RS} _{{L}, {\rho }}\), \(M \in \mathbb {F}_q^{H_1 \times H_2}\) and the linear relationship \({{}\widehat{f}_1}_{|H_1} = M \cdot {{}\widehat{f}_2}_{|H_2}\) holds.

Definition 4

We define \(\mathcal {R}_{ \textsf{Row} }\) as the set of tuples \(((\mathbb {F}_q, L, H, \rho ), (f_1, f_2, f_3))\) such that \(L, H \subseteq \mathbb {F}_q\) are disjoint affine subspaces, \(f_i \in \textsf{RS} _{{L}, {\rho }}\) for \(i \in \{1, 2, 3\}\) and the quadratic relationship \({{}\widehat{f}_1}_{|H} *{{}\widehat{f}_2}_{|H} = {{}\widehat{f}_3}_{|H}\) holds.

RS-encoded IOPPs \(( \textsf{P} _{ \textsf{Lin} _{}}, \textsf{V} _{ \textsf{Lin} _{}})\) and \(( \textsf{P} _ \textsf{Row} , \textsf{V} _ \textsf{Row} )\) for the two relations above are provided in [BFH+20, BCR+19] and in [AHIV17] up to minor adaptations in the second case. We will need a generalisation of \(\mathcal {R}_{ \textsf{Lin} _{}}\) that tests relations of the form \(M_1 \textbf{x}_1 + \ldots + M_h \textbf{x}_h = \textbf{b}\) (for \(h = 2\), \(M_1 = -I\) and \(\textbf{b} = \textbf{0}\) we get back the standard lincheck).

Definition 5

\(\mathcal {R}_{ \textsf{Lin} _{h}}\) is the set of tuples \(((\mathbb {F}_q, L, H_0, H_i, \rho , M_i, \textbf{b})_{i = 1}^h, (f_i)_{i = 1}^h)\) such that \(L, H_0, H_i \le \mathbb {F}_q\), \(L \cap H_0 = L \cap H_i = \varnothing \) for all \(i \in \{1, \ldots , h\}\), \(f_i \in \textsf{RS} _{{L}, {\rho }}\), \(M_i \in \mathbb {F}_q^{H_0 \times H_i}\) and the linear relationship \(\sum _{i = 1}^h M_i \cdot {{}\widehat{f}_i}_{|H_i} = \textbf{b}\) holds.

The lincheck protocol presented in Aurora can be generalised to capture this variant, as shown in the full version of this paper.

3 Simplified Construction

In the rest of the paper we aim to describe an efficient RS-encoded IOP for \(\mathbb {F}_2\)-R1CS. As the only tools we assume are a lincheck and a rowcheck over a large enough field, our first step, in Sect. 3.1, is to characterise \(\mathbb {F}_2\)-R1CS in terms of one quadratic relation over \(\mathbb {F}_q\) and a set of linear relations modulo some vector space \(V \le \mathbb {F}_q\). An RS-encoded IOPP to test the latter conditions is provided in Sect. 3.3. Finally, a simple solution that naively uses the above IOPP is provided in Sect. 3.4. Even if suboptimal, we see this as a useful stepping stone to better present the efficient version in Sect. 4.2.

3.1 Characterisation of R1CS

In the following we assume \((\varphi ,\psi )\) to be a \((k,e)_2\)-RMFE, where \(q=2^e\), and recall that \(\varPhi , \varPsi \) denote the block-wise application of \(\varphi \) and \(\psi \), cf. Sect. 2.1.

Theorem 1

Let \(A_1, A_2, A_3 \in \mathbb {F}_2^{m,n}\), \(\textbf{b} \in \mathbb {F}_2^m\) with m and n multiples of k. Then there exists \(\textbf{w} \in \mathbb {F}_2^n\) such that \(((\mathbb {F}_2, m, n, A_1, A_2, A_3, \textbf{b}), \textbf{w}) \in \mathcal {R}_{ \textsf{R1CS} }\) if and only if there exist \({{}\widetilde{\textbf{w}}} \in \mathbb {F}_q^{n/k}\) and \({{}\widetilde{\textbf{x}}}_1, {{}\widetilde{\textbf{x}}}_2, {{}\widetilde{\textbf{x}}}_3, \textbf{t} \in \mathbb {F}_q^{m/k}\) satisfying

$$\begin{aligned} {{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2 = \textbf{t} \end{aligned}$$ (1)
$$\begin{aligned} {{}\widetilde{\textbf{w}}} = \textbf{0}&\mod ({\textrm{Im}}\,\varphi )^{n/k} \end{aligned}$$ (2)
$$\begin{aligned} {{}\widetilde{\textbf{x}}}_i = \textbf{0}&\mod ({\textrm{Im}}\,\varphi )^{m/k}&\forall i \in \{1, 2, 3\} \end{aligned}$$ (3)
$$\begin{aligned} {{}\widetilde{A}}_i {{}\widetilde{\textbf{w}}} - {{}\widetilde{I}}_m {{}\widetilde{\textbf{x}}}_i = \textbf{0}&\mod ({\textrm{Ker}}\,S \circ \psi )^{m}&\forall i \in \{1, 2, 3\} \end{aligned}$$ (4)
$$\begin{aligned} \textbf{t} - u {{}\widetilde{\textbf{x}}}_3 = u {{}\widetilde{\textbf{b}}}&\mod ({\textrm{Ker}}\,\psi )^{m/k} \end{aligned}$$ (5)

where \({{}\widetilde{\textbf{b}}} = \varPhi (\textbf{b}) \in \mathbb {F}_q^{m/k}\), \(u = \varphi (\textbf{1}_{k}) \in \mathbb {F}_q\), \({{}\widetilde{A}}_i \in \mathbb {F}_q^{m,n/k}\) is the matrix obtained by applying \(\varPhi \) row-wise to \(A_i\), and \({{}\widetilde{I}}_m \in \mathbb {F}_q^{m,m/k}\) is the matrix obtained by applying \(\varPhi \) row-wise to the identity matrix \(I_m \in \mathbb {F}_2^{m,m}\). Moreover if \(\textbf{w}\) is a witness for the R1CS then \({{}\widetilde{\textbf{w}}} = \varPhi (\textbf{w})\), \({{}\widetilde{\textbf{x}}}_i = \varPhi (A_i \textbf{w})\), \(\textbf{t} = {{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2\) satisfy the conditions above.

The proof appears in the full version [CG21], but we remark that Eqs. (2) and (3) are equivalent to saying that \({{}\widetilde{\textbf{w}}} = \varPhi (\textbf{w})\), \({{}\widetilde{\textbf{x}}}_i = \varPhi (\textbf{x}_i)\) for some \(\textbf{w}\), \(\textbf{x}_i\); Eqs. (1) and (5) encode \(\textbf{x}_1*\textbf{x}_2=\textbf{x}_3+\textbf{b}\) (the rowcheck), the latter being derived using properties (2) and (3) of Lemma 1; and Eqs. (4) encode \(A_i\textbf{w}=\textbf{x}_i\) (the lincheck) and are derived from property (4) of Lemma 1.
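
As an illustration of how these equivalences arise (the full proof is in [CG21]), Eq. (5) can be obtained as follows. Writing \(\textbf{x}_i = A_i \textbf{w}\), so that \({{}\widetilde{\textbf{x}}}_i = \varPhi (\textbf{x}_i)\) and \(\textbf{t} = {{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2\), properties (2) and (3) of Lemma 1 give \(\textbf{x}_1 *\textbf{x}_2 = \varPsi (\textbf{t})\), \(\textbf{x}_3 = \varPsi (u\, {{}\widetilde{\textbf{x}}}_3)\) and \(\textbf{b} = \varPsi (u\, {{}\widetilde{\textbf{b}}})\), hence

$$ \textbf{x}_1 *\textbf{x}_2 = \textbf{x}_3 + \textbf{b} \;\Longleftrightarrow \; \varPsi \big ( \textbf{t} - u\, {{}\widetilde{\textbf{x}}}_3 - u\, {{}\widetilde{\textbf{b}}} \big ) = \textbf{0} \;\Longleftrightarrow \; \textbf{t} - u\, {{}\widetilde{\textbf{x}}}_3 = u\, {{}\widetilde{\textbf{b}}} \mod ({\textrm{Ker}}\,\psi )^{m/k}, $$

where the first equivalence uses these identities together with the \(\mathbb {F}_2\)-linearity of \(\varPsi \), and the second one the fact that \(\varPsi (\textbf{z}) = \textbf{0}\) exactly when every coordinate of \(\textbf{z}\) lies in \({\textrm{Ker}}\,\psi \).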

3.2 Linear Hashing

We now adapt linear checks to small fields. A common technique to test \(A \textbf{x} = \textbf{b}\) over \(\mathbb {F}_q\) is to sample a random vector \(\textbf{r} \in \mathbb {F}_q^m\) and check \(\textbf{r}^\top A \textbf{x} = \textbf{r}^\top \textbf{b}\). Alternatively, one can set \(\textbf{r} = (1, r, \ldots , r^{m - 1})\) for \(r \leftarrow ^{\$}\mathbb {F}_q\) to save randomness. The soundness errors of these approaches are respectively 1/q and \((m-1)/q\), which are too large if q is small, as in our case. Therefore they need to be adapted. With this aim in mind, let \(\vartheta : \mathbb {F}_2^\lambda \rightarrow \mathbb {F}_{2^\lambda }\) be an isomorphism of \(\mathbb {F}_2\)-linear spaces. For any \(\alpha \in \mathbb {F}_{2^\lambda }\) define \(R_{\alpha }^{(m)} : \mathbb {F}_2^{\lambda m} \rightarrow \mathbb {F}_2^{\lambda }\) such that

$$ R_{\alpha }^{(m)}(\textbf{x}_1, \ldots , \textbf{x}_m) \; = \; \vartheta ^{-1}\big ( \alpha \vartheta (\textbf{x}_1) + \ldots + \alpha ^m \vartheta (\textbf{x}_m) \big ). $$

Seeing this function as a matrix in \(\mathbb {F}_2^{\lambda , \lambda m}\), we can apply it to vectors in \(\mathbb {F}_q^{\lambda m}\), i.e., if \(R_{\alpha }^{(m)} = (r_{i,j}) \in \mathbb {F}_2^{\lambda , \lambda m}\) and \(\textbf{x} = (x_j)_{j = 1}^{\lambda m} \in \mathbb {F}_q^{\lambda m}\) then \(R_{\alpha }^{(m)} \textbf{x} = \left( \sum \nolimits _{j = 1}^{\lambda m} r_{i, j} x_j \right) _{i = 1}^\lambda \). This family of linear functions satisfies the following properties.

Proposition 1

Let \(V \le \mathbb {F}_q\) be an \(\mathbb {F}_2\)-vector subspace, \(\textbf{y} \in \mathbb {F}_q^\lambda \), \(\textbf{x} \in \mathbb {F}_q^{\lambda m} \setminus V^{\lambda m}\) and \(\alpha \sim U(\mathbb {F}_{2^\lambda })\). Then \( \Pr \left[ R_{\alpha }^{(m)} \textbf{x} = \textbf{y} \mod V^\lambda \right] \; \le \; 2^{-\lambda } \cdot m\).

Proposition 2

Let \(V \le \mathbb {F}_q\) be an \(\mathbb {F}_2\)-vector subspace, \(\textbf{y} \in \mathbb {F}_q^\lambda \), and \(\textbf{x}_i \in \mathbb {F}_q^{\lambda m_i}\) for \(i \in {[h]}\) such that \(\textbf{x}_j \notin V^{\lambda m_j}\) for some j. Then, for independently sampled \(\alpha _i \sim U(\mathbb {F}_{2^\lambda })\),

$$ \Pr \left[ R_{\alpha _1}^{(m_1)} \textbf{x}_1 + \ldots + R_{\alpha _h}^{(m_h)} \textbf{x}_h = \textbf{y} \mod V^\lambda \right] \; \le \; 2^{-\lambda } \cdot \max \{m_i : i \in {[h]}\}. $$
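
The following self-contained Python sketch checks Proposition 1 exhaustively in a toy special case (the \(\mathbb {F}_2\) version of the hash family with \(\lambda = 3\), \(m = 2\), \(V = \{0\}\) and \(\textbf{y} = \textbf{0}\)); the tiny parameters are illustrative only and are chosen so that the check can run over all inputs and all \(\alpha \).

```python
# Toy check of the linear hash family R_alpha (Sect. 3.2) with lambda = 3,
# over F_{2^3} = F_2[T]/(T^3 + T + 1).  Parameters are illustrative only.

LAMBDA = 3
IRRED = 0b1011  # T^3 + T + 1

def gf_mul(a, b):
    """Multiply two elements of F_{2^3}, each encoded as a 3-bit integer."""
    r = 0
    for i in range(LAMBDA):
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):                 # reduce degrees 4 and 3
        if (r >> i) & 1:
            r ^= IRRED << (i - LAMBDA)
    return r

def R(alpha, blocks):
    """R_alpha^{(m)}(x_1,...,x_m) = theta^{-1}(alpha*theta(x_1) + ... + alpha^m*theta(x_m)),
    with theta the identification of F_2^lambda with F_{2^lambda} (3-bit ints here)."""
    acc, power = 0, alpha
    for x in blocks:
        acc ^= gf_mul(power, x)      # add alpha^i * theta(x_i)
        power = gf_mul(power, alpha)
    return acc

# Proposition 1, special case V = {0}, y = 0, m = 2: for every nonzero x,
# at most 2^{-lambda} * m of the alphas (i.e. at most m of them) map x to 0.
m = 2
for x1 in range(8):
    for x2 in range(8):
        if (x1, x2) == (0, 0):
            continue
        bad = sum(1 for alpha in range(8) if R(alpha, [x1, x2]) == 0)
        assert bad <= m, (x1, x2, bad)
print("Proposition 1 verified exhaustively for lambda = 3, m = 2, V = {0}")
```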

3.3 Modular Lincheck

In this section we provide an RS-encoded IOPP that generalises the Lincheck to linear relations of the form \(M_1 \textbf{x}_1 + \ldots + M_h \textbf{x}_h = \textbf{b}\) modulo an \(\mathbb {F}_2\) vector space \(V \le \mathbb {F}_q\), where the verifier has oracle access to an encoding of \(\textbf{x}_i\) for each i.

Definition 6

The Modular Lincheck relation is the set \(\mathcal {R}_{ \textsf{Mlin} _h}\) of all tuples \(((\mathbb {F}_q, L, H_0, H_i, \rho , M_i, \textbf{b}, V)_{i = 1}^h, (f_i)_{i = 1}^h)\) such that \(L, H_0, H_i \subseteq \mathbb {F}_q\) are affine \(\mathbb {F}_2\)-spaces with \(L \cap H_i = \varnothing \), \(\rho \in [0,1)\), \(M_i \in \mathbb {F}_q^{H_0 \times H_i}\), \(f_i \in \textsf{RS} _{{L}, {\rho }}\) and \(\sum _{i = 1}^h M_i {{}\widehat{f}_i}_{|H_i} = \textbf{b} \mod V^{H_0}\).

Consider the simpler statement \(\textbf{x} = \textbf{0} \mod V^H\), i.e. \(\textbf{x} \in V^H\), and the following proof: the verifier samples a random \(R \sim U(\mathbb {F}_2^{H'_0 \times H})\) and receives \(\textbf{v} = R \textbf{x}\) from the prover; it then checks \(\textbf{v} \in V^{H'_0}\) and runs a lincheck to test \(\textbf{v} = R \textbf{x}\). In order to make this zero knowledge, we add a masking codeword g sampled from \( \textsf{Mask} (L, \rho , H'_0, V) = \{ f \in \textsf{RS} _{{L}, {\rho }} \, : \, {{}\widehat{f}}_{|H'_0} \in V^{H'_0} \}\), so that the prover first sends an oracle to g, receives R, and sends \(\textbf{v} = R \textbf{x} + {{}\widehat{g}}_{|H'_0}\) in the clear. In the general case we replace \(\textbf{x}\) with \(\sum _{i = 1}^h M_i {{}\widehat{f}_i}_{|H_i} - \textbf{b}\) and, for efficiency reasons, the random matrix R with \(R_\alpha \), obtaining the protocol in Fig. 1.

Fig. 1. RS-encoded IOPP for \(\mathcal {R}_{ \textsf{Mlin} _h}\) with \( \textsf{pp} = (\mathbb {F}_q, L, H_0, (H_i)_{i = 1}^h, \rho )\).

From the above observations, the protocol has the following properties, where soundness comes from Proposition 2. See the full paper for a rigorous proof.

Theorem 2

Protocol 1 is an RS-encoded IOPP for the relation \(\mathcal {R}_{ \textsf{Mlin} _h}\) that, upon setting \(|H'_0| = \lambda \), has the following parameters:

(table of parameters omitted)

where \(H = \textsf{span} \left( H_1, \ldots , H_h, H'_0 \right) \) and \(T^ \textsf{P} _{ \textsf{Lin} _{h+1}}, T^ \textsf{V} _{ \textsf{Lin} _{h+1}}\) denote the costs of running \( \textsf{P} _{ \textsf{Lin} _{h+1}}\) and \( \textsf{V} _{ \textsf{Lin} _{h+1}}\), respectively.

3.4 An RS-Encoded IOP for R1CS from Modular Lincheck

Given RS-encoded IOPPs for the modular lincheck and the rowcheck, we briefly sketch how to build a simple RS-encoded IOP for \(\mathbb {F}_2\)-R1CS. By Theorem 1 we know that a given system, defined by \(A_1, A_2, A_3 \in \mathbb {F}_2^{m, n}\) and \(\textbf{b} \in \mathbb {F}_2^m\), is satisfied if and only if there exist \({{}\widetilde{\textbf{x}}}_1, {{}\widetilde{\textbf{x}}}_2, {{}\widetilde{\textbf{x}}}_3, \textbf{t} \in \mathbb {F}_q^{m/k}\) and \({{}\widetilde{\textbf{w}}}\in \mathbb {F}_q^{n/k}\) that satisfy Eqs. (1)–(5).

Thus we let the prover initially compute the extended witness \(\textbf{x}_i = A_i \textbf{w}\), apply the RMFE block-wise to get \({{}\widetilde{\textbf{x}}}_i = \varPhi (\textbf{x}_i)\), \({{}\widetilde{\textbf{w}}} = \varPhi (\textbf{w})\) and finally set \(\textbf{t} = {{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2\). Next, it picks two affine subspaces \(H_1, H_2 \subseteq \mathbb {F}_q\) of sizes m/k and n/k respectively, and samples five codewords \(f_{{{}\widetilde{\textbf{x}}}_i}, f_{\textbf{t}}, f_{{{}\widetilde{\textbf{w}}}}\) such that \({{}\widehat{f}_{{{}\widetilde{\textbf{x}}}_i}}_{|H_1} = {{}\widetilde{\textbf{x}}}_i\), \({{}\widehat{f}_\textbf{t}}_{|H_1} = \textbf{t}\) and \({{}\widehat{f}_{{}\widetilde{\textbf{w}}}}_{|H_2} = {{}\widetilde{\textbf{w}}}\).

Finally, the prover provides the verifier with oracle access to these codewords, and the two parties run:

  • One rowcheck to test \({{}\widetilde{\textbf{x}}}_1 *{{}\widetilde{\textbf{x}}}_2 = \textbf{t}\).

  • Four modular linchecks to test \(I_{m/k} \cdot {{}\widetilde{\textbf{x}}}_i \in ({\textrm{Im}}\,\varphi )^{H_1}\) and \(I_{n/k} \cdot {{}\widetilde{\textbf{w}}} \in ({\textrm{Im}}\,\varphi )^{H_2}\).

  • Three modular linchecks to test that \(\tilde{A}_i \cdot {{}\widetilde{\textbf{w}}} - \tilde{I}_{m} \cdot {{}\widetilde{\textbf{x}}}_i \in ({\textrm{Ker}}\,S \circ \psi )^m\).

  • One modular lincheck to check \(I_{m/k} \cdot \textbf{t} - (u I_{m/k} ) \cdot {{}\widetilde{\textbf{x}}}_3 = u {{}\widetilde{\textbf{b}}} \mod ({\textrm{Ker}}\,\psi )^{H_1}\).

Correctness and soundness of the above protocol follow from Theorem 1, while zero knowledge against \(\beta \) queries can be achieved by setting the rate of \(f_{{{}\widetilde{\textbf{x}}}_i}, f_\textbf{t}\) to \(\frac{m/k + \beta }{|L|}\) and the rate of \(f_{{}\widetilde{\textbf{w}}}\) to \(\frac{n/k + \beta }{|L|}\).

4 Efficient Construction

4.1 Batching Modular Linchecks and Packing Vectors

The protocol above requires a total of 8 modular linchecks. In this section we show how to reduce the number of required modular linchecks to three by batching proofs of relations modulo the same vector space: we aim to design an RS-encoded IOPP for a relation of the form \(\forall i \in {[h]}, A_i \textbf{x}_i = \textbf{b}_i \mod V^{m_i}\).

We propose the following: as before, the prover begins by sending a codeword that encodes a masking term \(\textbf{y} \sim U(V^\lambda )\). The verifier then chooses h matrices \(R_{\alpha _1}, \ldots , R_{\alpha _h}\) and the prover replies by sending \(\textbf{v} = \sum \nolimits _{i = 1}^h R_{\alpha _i} (A_i \textbf{x}_i - \textbf{b}_i) + \textbf{y}.\) Finally, the verifier checks that \(\textbf{v} \in V^\lambda \) and both parties execute a lincheck to test the above relation. Informally, security follows as in the single modular lincheck from Sect. 3.3, except that for soundness we use Proposition 2.

To further improve the complexities, we now show how to reduce the size of the vectors sent in the clear by the prover in the (batched) modular lincheck. Recalling that \(u = \varphi (\textbf{1}_{k})\), we point out that \({\textrm{Ker}}\,\psi \) and \(u \cdot {\textrm{Im}}\,\varphi \) intersect only in 0, because \(\psi (u \cdot \varphi (\textbf{v})) = \textbf{1}_{k} *\textbf{v} = \textbf{v}\). Since their dimensions over \(\mathbb {F}_2\) are \(e - k\) and k respectively, \(\mathbb {F}_q\) is the direct sum of \({\textrm{Ker}}\,\psi \) and \(u \cdot {\textrm{Im}}\,\varphi \). Then, given \(\textbf{x} \in ({\textrm{Im}}\,\varphi )^n\) and \(\textbf{y} \in ({\textrm{Ker}}\,\psi )^n\), we just need to send \(\textbf{z} = u \textbf{x} + \textbf{y}\). Given \(\textbf{z}\) one can extract \(\textbf{x} = \varPhi (\varPsi (\textbf{z}))\) and \(\textbf{y} = \textbf{z} - u \textbf{x}\), where the former equation is justified by observing that, if we let \(\textbf{v} \in \mathbb {F}_2^{kn}\) be such that \(\textbf{x} = \varPhi (\textbf{v})\), then \( \varPhi ( \varPsi (\textbf{z}) ) \; = \; \varPhi ( \varPsi ( u \textbf{x} + \textbf{y} ) ) \; = \; \varPhi ( \varPsi ( u \cdot \varPhi (\textbf{v}) ) ) \; = \; \varPhi ( \textbf{v} ) \; = \; \textbf{x}, \) where the second equality follows from \(\textbf{y} \in ({\textrm{Ker}}\,\psi )^n\) and the third one from Lemma 1.
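
The next self-contained sketch checks this decomposition and the recovery procedure for the same toy \((2,3)_2\)-RMFE used in the earlier sketches (again an illustrative small example, here with n = 1); note that in this toy RMFE u happens to equal 1, so \(u \cdot {\textrm{Im}}\,\varphi = {\textrm{Im}}\,\varphi \).

```python
# Packing z = u*x + y with x in Im(phi) and y in Ker(psi), and recovery of x and y,
# for the toy (2,3)_2-RMFE over F_8 = F_2[T]/(T^3 + T + 1).  Illustrative only.

IRRED = 0b1011

def gf8_mul(a, b):
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):
        if (r >> i) & 1:
            r ^= IRRED << (i - 3)
    return r

def phi(x):
    x1, x2 = x
    return x1 | ((x1 ^ x2) << 1)

def psi(z):
    g0, g1, g2 = z & 1, (z >> 1) & 1, (z >> 2) & 1
    return (g0, g0 ^ g1 ^ g2)

u = phi((1, 1))                                          # u = phi(1_k); here u = 1
image = {phi((a, b)) for a in (0, 1) for b in (0, 1)}    # Im(phi), dimension k = 2
kernel = {z for z in range(8) if psi(z) == (0, 0)}       # Ker(psi), dimension e - k = 1

# Direct sum: trivial intersection (and the dimensions add up to e = 3).
assert {gf8_mul(u, x) for x in image} & kernel == {0}

for x in image:
    for y in kernel:
        z = gf8_mul(u, x) ^ y          # the single packed element sent in the clear
        x_rec = phi(psi(z))            # x = Phi(Psi(z))
        y_rec = z ^ gf8_mul(u, x_rec)  # y = z - u*x (subtraction is XOR)
        assert (x_rec, y_rec) == (x, y)
print("packing and unpacking verified")
```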

4.2 An Efficient RS-Encoded IOP for R1CS

With the two ideas presented so far we can now improve the protocol sketched in Sect. 3.4. We batch the linchecks into three groups, testing equations modulo \({\textrm{Im}}\,\varphi \), \({\textrm{Ker}}\,S \circ \psi \) and \({\textrm{Ker}}\,\psi \) respectively. Moreover, we observe that the masking terms of these tests can be aggregated. To do so, we choose three disjoint affine subspaces \(H_1', H_2', H_3'\) of size \(\lambda \) and sample g from the set \( \textsf{BMask} \left( L, \rho , H_1', H_2', H_3', \varphi , \psi \right) \) defined as

$$ \left\{ f \in \textsf{RS} _{{L}, {\rho }} \, : \, {{}\widehat{f}}_{|H'_1} \in ({\textrm{Im}}\,\varphi )^{H'_1} \; , \; {{}\widehat{f}}_{|H'_2} \in ({\textrm{Ker}}\,S \circ \psi )^{H'_2} \; , \; {{}\widehat{f}}_{|H'_3} \in ({\textrm{Ker}}\,\psi )^{H'_3} \right\} . $$

In the following protocol we let \(\rho _1 = (m/k + \beta ) |L|^{-1}\), \(\rho _2 = (n/k + \beta ) |L|^{-1}\) and \(\rho _3 = (3\lambda + \beta ) |L|^{-1}\) be the three rates used (Fig. 2).

Fig. 2. RS-encoded IOP for R1CS. We fix a linear order on \(H_0, H_1, H_2\) and assume \({{}\widetilde{A}}_i \in \mathbb {F}_q^{H_0 \times H_2}, \; {{}\widetilde{I}}_m \in \mathbb {F}_q^{H_0 \times H_1}\). Note the first two steps can be precomputed knowing the input size, and that \(\textbf{v}_0, \textbf{v}_2\) are sent directly, i.e. without providing oracle access.

Theorem 3

Protocol 2 is an RS-encoded IOP for the relation \(\mathcal {R}_{ \textsf{R1CS} }\) which, using Aurora's lincheck and rowcheck, achieves the following parameters:

(table of parameters omitted)

Observe that this means we can take \(|L| \cdot \rho \approx \max (2m/k, 2n/k, 3\lambda ) + 2 \beta \) for a fixed rate \(\rho \approx 1/8\).

5 Comparisons

In this section we compare our construction with [AHIV17, BCR+19, GSV21, BFH+20] when proving satisfiability of an R1CS over \(\mathbb {F}_2\). In all cases we assume [BCS16] is used to compile the IOPs into NIZKs. Our focus is on the proof size, which we compute through a parameter optimiser, available at [Git21], based on [lib20], the open source implementation of Aurora and R1CS-Ligero. We also consider prover efficiency, which we only estimate theoretically. As for verifier time, we do not expect significant improvements or overhead, as the asymptotic costs are the same with roughly the same constants.

Aurora - Proof Size: When compiling Aurora [BCR+19] to a NIZK, the proof size is dominated by the replies to oracle queries. Calling |L| the block length of the Reed-Solomon code in use, each of these replies requires \(O(\log ^2|L|)\) hash values. As we use Reed-Solomon codewords that encode vectors k times smaller than those of Aurora with the naïve embedding, the block length in our work is roughly k times smaller. We therefore estimate the proof size to be reduced by a term \(O(\log k \log |L|)\). Concrete proof sizes are shown in Fig. 3, where the results on the left are obtained using proven soundness bounds, while those on the right use optimistic (but not proven) bounds; see the full version for more details. The improvement factor for \(2^{20}\) constraints with a \((48, 192)_2\)-RMFE and 128 security bits amounts to 1.65 in the first case and to 1.31 in the second.

Fig. 3. Argument size w.r.t. the number of constraints at 128 bits of security for: Aurora with proven soundness bounds (top left) and with optimistic bounds (top right), Ligero/BooLigero with interactive repetitions and smaller fields (bottom left) and Ligero++ (bottom right). Our work uses a \((48, 192)_2\)-RMFE in the first two cases, and a \((48, 160)_2\)-RMFE for the others.

Aurora - Prover Time: Using again the fact that the block length is reduced by a factor of k with a \((k, e)_2\)-RMFE, observe that:

  • In the RS-encoded IOP, the cost is dominated by \(18 \cdot \textrm{FFT}(\mathbb {F}_q, |L|)\) field operations. In our case we perform 35 fast Fourier transforms over a set k times smaller, leading to an improvement factor of 18k/35.

  • In the low degree test, prover complexity is upper bounded by 6|L| arithmetic operations [BBHR18a]. Hence our construction improves by a factor k.

  • In the BCS transform, computing the Merkle tree from an oracle of size |L| requires \(2|L| - 1\) hashes. Using column hashing, our construction requires the same number of trees as plain Aurora. Moreover, calling \(f_i\) FRI's i-th oracle, the length of \(f_i\) is \(|L| \cdot 2^{-i\eta }\) for a constant \(\eta \), i.e. it scales linearly in |L|. Therefore our protocol requires k times fewer hash function evaluations.

In conclusion, we estimate that deploying a \((48, 192)_2\)-RMFE leads to a \(18k/35 \approx 24.7 \times \) speed up asymptotically.

Ligero and BooLigero - Proof Size: Applying our construction to R1CS-Ligero [BCR+19], whose proof size is \(\varTheta (\sqrt{n})\), over a field \(\mathbb {F}_{2^{160}}\), we obtain proofs shorter by a factor \(\sqrt{k} \approx 6.9\), as we invoke every sub-protocol on inputs k times shorter. However, [AHIV17] presents an optimisation through interactive repetitions working over smaller fields. As this version is harder to analyse asymptotically, we estimate its cost by comparing it with BooLigero and our construction using a \((48, 160)_2\)-RMFE (Fig. 3, bottom left).

Ligero and BooLigero - Prover Time: For simplicity we only compare our construction to Ligero without repetitions, as in this case operations are performed over the same extension of \(\mathbb {F}_2\), for an R1CS over \(\mathbb {F}_2\) with n variables and n constraints. Recall that \(|L| = \varTheta (\sqrt{n})\) and each vector is divided into m blocks of length \(\ell \), both growing asymptotically as \(\sqrt{n}\). As in Aurora, we split the prover time into three terms:

  • In the IOP, costs are dominated asymptotically by \(21 m \cdot \textrm{FFT}(\mathbb {F}_q, |L|)\). In our case we need \(31 m'\) fast Fourier transforms, but with \(m' \sim m/ \sqrt{k}\) and over a set \(\sqrt{k}\) times smaller, leading to an improvement factor of 21k/31.

  • As Ligero performs a direct low degree test, no extra computation is needed for testing proximity.

  • In the BCS transform, using column hashing, only one tree with \(2|L| - 1\) nodes has to be computed. Hence in our construction this step is performed \(\sqrt{k}\) times faster.

In conclusion, we expect an improvement factor between \(6.9\times \) and \(32.5\times \) with a \((48, 160)_2\)-RMFE. We leave the prover-time comparison with the more efficient version of Ligero, which allows repetitions, as future work.

Ligero++: As [BFH+20] combines Ligero with an inner product argument, which can be realised by adapting Aurora's sumcheck to achieve poly-logarithmic argument size, we expect a prover time reduction comparable to those for plain Ligero and Aurora. The same applies to the proof size, which, for completeness, we also estimate through our parameter optimiser (Fig. 3), obtaining a median improvement factor of \(1.26 \times \).