1 Introduction

Background. A key feature of an interactive proof is soundness, which requires that the verifier will not accept a false statement, i.e., an instance x that is not in the considered language, except with bounded probability. In many situations however, a stronger notion of soundness is needed: knowledge soundness. Informally, knowledge soundness requires the prover to know a witness w that certifies that x is a true statement, in order for the verifier to accept (except with bounded probability). More formally, this is captured by the existence of an efficient extractor, which has (rewindable) oracle access to any, possibly dishonest, prover, and which outputs a witness w for the considered statement x with a probability that is tightly related to the probability of the prover making the verifier accept.

Since their introduction, interactive proofs that satisfy knowledge soundness, then typically referred to as proofs of knowledge, have found a myriad of applications. However, showing that an interactive proof satisfies knowledge soundness is typically non-trivial, and often significantly more involved than showing ordinary soundness. By default, it involves designing the extractor, and proving that it “does the job.” We got spoiled in the past, where most of the considered interactive proofs were \(\varSigma \)-protocols, i.e., public-coin 3-round interactive proofs, and had the additional property of being special-sound. Indeed, this made life rather easy since special-soundness is a property that is usually quite easy to prove, and that implies ordinary and knowledge soundness via a general classical result. Thus, knowledge soundness was often obtained (almost) for free. However, this has changed in recent years, where the focus has shifted towards finding highly efficient interactive proofs (where efficiency is typically measured via the communication complexity, verification time, etc.); many of these highly efficient solutions are not special-sound, and thus require a knowledge-soundness proof from scratch.

Given this situation, it would be desirable to have stronger versions of the generic “special-soundness \(\Rightarrow \) knowledge soundness” result that apply to a weaker notion of special-soundness, which then hopefully is satisfied by these new cutting-edge interactive proofs. One step in this direction was recently made in [2, 3], where the above implication was extended to k-special-sound interactive proofs, and, even more generally, to \((k_1,\ldots ,k_\mu )\)-special-sound multi-round public-coin interactive proofs, for arbitrary positive integer parameters, subject to being suitably bounded from above (e.g., \(k \le \textrm{poly}(|x|)\)). Rather naturally, k-special-soundness means that from accepting responses to k pairwise distinct challenges for one fixed first message, a witness can be efficiently computed (so that 2-special-soundness coincides with the classical special-soundness property); for the multi-round version, a suitable tree of transcripts is needed for computing a witness. This weaker notion of special-soundness is in particular sufficient to analyze Bulletproofs-like protocols, and so we directly obtain knowledge soundness for these protocols.

However, this weaker notion still falls short of capturing many of the recent highly-efficient interactive proofs. For instance, a commonly used amortization technique, where the prover proves a random linear combination of n statements (instead of proving all the statements individually), requires correct responses for n linearly independent challenge vectors in order to compute a witness. Another example comes from the design principle to first construct a highly efficient probabilistically checkable proof (PCP) or interactive oracle proof (IOP), and then to compile it into a standard (public-coin) interactive proof in the natural way by means of a Merkle-tree commitment [11,12,13]. Also here, one does not obtain a special-sound protocol in the above generalized sense (or then only for a too large parameter); instead, one requires challenges that correspond to sets whose union covers all (or sufficiently many of) the leaves of the Merkle tree, in order to obtain a witness.

Our Technical Results. In this paper, we push the weakening of the special-soundness property to the extreme. For \(\varSigma \)-protocols, in the spirit of ordinary or k-special-soundness, the notion of special-soundness that we consider in this work requires that a witness can be efficiently computed from accepting responses to sufficiently many pairwise distinct challenges, but now “sufficiently many” is captured by an arbitrary monotone (access) structure \(\varGamma \), i.e., an arbitrary monotone set of subsets of the challenge set. This gives rise to the notion of \(\varGamma \)-special-soundness, which coincides with k-special-soundness in the special case where \(\varGamma \) is the threshold access structure with threshold k. This naturally extends to multi-round public-coin interactive proofs, leading to the notion of \((\varGamma _1,\ldots ,\varGamma _\mu )\)-special-soundness. Similar notions were considered in [9, 10] in the setting of commit-and-open \(\varSigma \)-protocols, and in some more constrained form, where the monotone structures are replaced by matroids, in [14, 15].

We cannot expect for every \(\varGamma \) that a \(\varGamma \)-special-sound protocol is a proof of knowledge. Instead, we identify parameters \(t_\varGamma \) and \(\kappa _\varGamma \), determined by the structure \(\varGamma \), and for any \(\varGamma \)-special-sound \(\varSigma \)-protocol we prove the existence of an extractor that has a knowledge error \(\kappa _\varGamma \) and an expected running time that scales with \(t_\varGamma \). Thus, as long as \(t_\varGamma \le \textrm{poly}(|x|)\), \(\varGamma \)-special-soundness implies knowledge soundness. Similarly for \((\varGamma _1,\ldots ,\varGamma _\mu )\)-special-sound multi-round protocols.

The construction of our extractor for \(\varGamma \)-special-sound protocols (and its multi-round generalization) is inspired by the extractor construction from [3]. As a nice consequence, we can recycle the line of reasoning from [3] to prove strong parallel repetition and extend it to our general notion of special-soundness, showing that also here the knowledge error of a parallel repetition decreases exponentially with the number of repetitions. For this result, we refer to the full version [1].

Applications. Our general technique gives immediate, tight results for simple but important example protocols. For example, applied to the above mentioned amortization technique of proving a random linear combination, we directly obtain knowledge extraction with a knowledge error that matches the trivial cheating probability. Similarly, applied to the natural interactive proof for a Merkle commitment, where the prover is challenged to open a random subset (of a certain size), we obtain a knowledge error that matches the probability of one faulty node not being opened.

In order to demonstrate the usefulness of our result beyond the above simple examples, we analyze the (interactive) FRI protocol [5].Footnote 1 We prove that for a certain range of parameters, when instantiated with a Merkle tree commitment using a collision-resistant hash function (or with any non-interactive, computationally binding vector commitment scheme with local openings), the protocol admits a knowledge extractor with knowledge error essentially matching the trivial cheating probability, with the following caveat: the knowledge extractor runs only in (expected) quasi-polynomial time. (At least, this is true if the protocol is run for logarithmically many rounds, as is typically done. For a natural constant-round variant, which requires more total communication, we can obtain nearly optimal knowledge soundness, i.e., here the knowledge extractor runs in expected polynomial time.) In more detail, for any proximity parameter \(\delta \) up to \(\delta < \frac{1-\rho }{4}\), where \(\rho \) is the relative rate of the considered code, we establish the existence of a knowledge extractor running in expected time \(N^{O(\log N)}\) which, when given oracle access to a (potentially dishonest) prover \(\mathcal {P}^*\), succeeds with probability at least \(\epsilon (\mathcal {P}^*) - ((1-\delta )^t + O(N/|\mathbb {F}|))\), where N is the length of the code, t is the number of repetitions of a certain verification step, and \(\epsilon (\mathcal {P}^*)\) is the probability \(\mathcal {P}^*\) convinces the verifier to accept.Footnote 2 For context, the trivial cheating probability for the protocol is \(\max \{(1-\delta )^t,1/|\mathbb {F}|\}\). In contrast to the above simple examples, arguing that the FRI protocol is \((\varGamma _1,\ldots ,\varGamma _\mu )\)-special-sound is not trivial; however, technical results from [5] can be recycled in order to show this, and then the existence of the knowledge extractor follows immediately from our generic result.
While proving the existence of a quasi-polynomial time extractor does not suffice for establishing the standard notion of knowledge soundness, we believe that it still offers a nontrivial guarantee with the potential for practical relevance.

A final example, which we would like to briefly discuss, is parallel repetition. This example shows that our generic technique does not always work. For simplicity, consider a k-special-sound \(\varSigma \)-protocol with \(k > 2\) (but the discussion also applies to multi-round protocols, and to our generalized notion of special soundness). Then, its t-fold parallel repetition is not k-special-sound anymore (unless \(k=2\)). One can argue that it is \(\big ((k-1)^t+1\big )\)-special-sound — but this parameter is exponential in t, and thus one cannot directly conclude knowledge soundness. On the other hand, equipped with our generalized notion, one can observe that the parallel repetition is \(\varGamma \)-special-sound for \(\varGamma \) being the structure that accepts a list of challenge vectors, each vector of length t, if there is one position where the challenge vectors feature at least k different values. Unfortunately, also here, the crucial parameter \(t_\varGamma \) turns out to be exponential for this structure \(\varGamma \), and so our generic result does not imply knowledge soundness. Fortunately, for this particular and important example, the parallel repetition result from [3] applies in case of k-special-sound protocols (and its multi-round generalization), and our extension (see the full version [1]) of the parallel repetition applies in case of arbitrary \((\varGamma _1,\ldots ,\varGamma _\mu )\)-special-sound protocols. Thus, after all, we can still argue (optimal) knowledge soundness in this case.

In conclusion, we expect that with our generic result for \((\varGamma _1,\ldots ,\varGamma _\mu )\)-special-sound protocols (which requires control over certain parameters to be applicable), and with our general parallel repetition result, our work offers powerful tools for proving knowledge soundness of many sophisticated proofs of knowledge.

2 Preliminaries

We write \(\mathbb {N}_{0} = \mathbb {N}\cup \{0\}\) for the set of nonnegative integers. Further, for any \(q\in \mathbb {Z}\), \(\mathbb {Z}_q = \mathbb {Z}/q\mathbb {Z}\) denotes the ring of integers modulo q.

2.1 Interactive Proofs

Let us now introduce some standard terminology and definitions with respect to interactive proofs. We follow standard conventions as presented in [4].

Let \(R \subseteq \{0,1\}^* \times \{0,1\}^*\) be a binary relation, containing statement-witness pairs \((x;w)\). We assume all relations to be NP-relations, i.e., verifying that \((x;w) \in R\) takes time polynomial in \(\left|x\right|\). An interactive proof for a relation R aims to allow a prover \(\mathcal {P}\) to convince a verifier \(\mathcal {V}\) that a public statement x admits a (secret) witness w, i.e., \((x;w)\in R\), or even that the prover knows a witness w for x.

An interactive proof with three communication rounds, where we may assume the prover to send the first and final message, is called a \(\varSigma \)-protocol. Further, an interactive proof is said to be public-coin if the verifier publishes all its random coins. In this case, we may assume all the verifier’s messages to be sampled uniformly at random from finite (challenge) sets.

An interactive proof is said to be complete if for any statement-witness pair \((x;w)\) an honest execution results in an accepting transcript (with high probability). It is sound if a dishonest prover cannot convince an honest verifier on public inputs x that do not admit a witness w, i.e., on false statements x. More precisely, \((\mathcal {P},\mathcal {V})\) is sound if \(\mathcal {V}\) rejects false statements x with high probability. The stronger notion of knowledge soundness requires that (potentially dishonest) provers that succeed in convincing the verifier with large enough probability must actually “know” a witness w. We will mainly be interested in analyzing the knowledge soundness of interactive proofs. For this reason, we formally define this property below.

Definition 1

(Knowledge Soundness). An interactive proof \((\mathcal {P},\mathcal {V})\) for relation R is knowledge sound with knowledge error \(\kappa :\mathbb {N}\rightarrow [0,1]\) if there exists a positive polynomial q and an algorithm \(\mathcal {E}\), called a knowledge extractor, with the following properties. Given input x and black-box oracle access to a (potentially dishonest) prover \(\mathcal {P}^*\), the extractor \(\mathcal {E}\) runs in an expected number of steps that is polynomial in |x| (counting queries to \(\mathcal {P}^*\) as a single step) and outputs a witness \(w \in R(x)\) with probability

$$ \Pr \bigl ((x;\mathcal {E}^{\mathcal {P}^*}(x)) \in R \bigr ) \ge \frac{\epsilon (\mathcal {P}^*,x)-\kappa (|x|)}{q(|x|)} , $$

where \(\epsilon (\mathcal {P}^*,x) := \Pr \bigl ( (\mathcal {P}^*,\mathcal {V})(x) = \textsf {accept}\bigr )\) is the success probability of \(\mathcal {P}^*\) on public input x.

Remark 1

(Interactive Arguments). In some cases, soundness and knowledge soundness only hold with respect to computationally bounded provers, i.e., unbounded provers can falsely convince a verifier. Computationally (knowledge) sound protocols are referred to as interactive arguments. Proving soundness of interactive arguments can be significantly more complicated than proving soundness of interactive proofs. However, in the context of knowledge soundness, an interactive argument for relation R can oftentimes be cast as an interactive proof for a modified relation

$$ R' = \{ (x;w) : (x;w) \in R \text { or } w \text { solves some computational problem}\}\,. $$

Hence, in this case the knowledge extractor will either output a witness w with respect to the original relation R, or it will output the solution to some computational problem, e.g., a discrete logarithm relation. In fact, our analysis of the FRI protocol in Sect. 7 exemplifies this general principle. For this reason, knowledge soundness of interactive arguments can typically be analyzed via knowledge extractors that are originally defined for interactive proofs. Therefore, we will focus on the analysis of interactive proofs.

Proving knowledge soundness of \(\varSigma \)-protocols directly is a nontrivial task, as it requires the construction of an efficient knowledge extractor. It is typically much easier to prove a related threshold special-soundness property, which states that a witness can be extracted from a sufficiently large set of colliding and accepting transcripts.

Definition 2

(k-out-of-N Special-Soundness). Let \(k,N \in \mathbb {N}\). A 3-round public-coin interactive proof \(\varPi = (\mathcal {P},\mathcal {V})\) for relation R, with challenge set of cardinality \(N\ge k\), is k-out-of-N special-sound if there exists an algorithm that, on input a statement x and k accepting transcripts \((a,c_1,z_1), \dots , (a,c_k,z_k)\) with common first message a and pairwise distinct challenges \(c_1,\dots ,c_k\), runs in polynomial time and outputs a witness w such that \((x;w) \in R\). We also say \(\varPi \) is k-special-sound and, if \(k=2\), it is simply said to be special-sound.

It is known that k-out-of-N special-soundness implies knowledge soundness with knowledge error \((k-1)/N\). Recently, the multi-round generalization \((k_1,\dots ,k_\mu )\)-out-of-\((N_1,\dots ,N_\mu )\) special-soundness has become relevant. It is now known that also this generalization tightly implies knowledge soundness [2]. For a formal definition, we refer either to [2] or to Sect. 6 where we generalize this (multi-round) notion beyond the threshold setting.

2.2 Geometric Distribution

This work adapts the extractor of [3]. For this reason, we also need the following preliminaries on the geometric distribution from their work.

A random variable B with two possible outcomes, denoted 0 (failure) and 1 (success), is said to follow a Bernoulli distribution with parameter p if \(p=\Pr ( B = 1 )\). Sampling from a Bernoulli distribution is also referred to as running a Bernoulli trial. The probability distribution of the number X of independent and identical Bernoulli trials needed to obtain a success is called the geometric distribution with parameter \(p = \Pr (X=1)\). In this case \(\Pr (X=k) = (1-p)^{k-1} p\) for all \(k \in \mathbb {N}\) and we write \(X \sim {\text {Geo}}(p)\). For two independent geometric distributions we have the following lemma.

Lemma 1

Let \(X \sim {\text {Geo}}(p)\) and \(Y \sim {\text {Geo}}(q)\) be independently distributed. Then,

$$ \Pr (X \le Y) = \frac{p}{p+q-pq} \ge \frac{p}{p+q} \, . $$
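As a quick numerical sanity check (our illustration, not part of the paper), the closed form in Lemma 1 can be verified by truncating the series \(\Pr (X \le Y) = \sum _{k \ge 1} \Pr (X = k)\Pr (Y \ge k)\):

```python
def pr_x_le_y(p, q, terms=100_000):
    # Truncated series: Pr(X <= Y) = sum_k Pr(X = k) * Pr(Y >= k)
    #                              = sum_k (1-p)^(k-1) * p * (1-q)^(k-1)
    return sum((1 - p) ** (k - 1) * p * (1 - q) ** (k - 1)
               for k in range(1, terms + 1))

def closed_form(p, q):
    # Lemma 1: Pr(X <= Y) = p / (p + q - pq) >= p / (p + q)
    return p / (p + q - p * q)

for p, q in [(0.3, 0.5), (0.9, 0.1), (0.05, 0.02)]:
    assert abs(pr_x_le_y(p, q) - closed_form(p, q)) < 1e-9
    assert closed_form(p, q) >= p / (p + q)
```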

3 A Generalized Notion of Special-Soundness for \(\varSigma \)-Protocols

In this section, we define a generalized notion of special-soundness. To this end, we first recall the definition of monotone structures.

Definition 3

(Monotone Structure). Let \(\mathcal {C}\) be a nonempty finite set and let \(\varGamma \subseteq 2^\mathcal {C}\) be a family of subsets of \(\mathcal {C}\). Then, \(\varGamma \) or \((\varGamma ,\mathcal {C})\) is said to be a monotone structure if it is closed under taking supersets, i.e., \(S \in \varGamma \) and \(S \subseteq T \subseteq \mathcal {C}\) implies \(T \in \varGamma \).

In some textbooks monotone structures \(\varGamma \) do not contain the empty set \(\emptyset \) by definition, which is equivalent to \(\varGamma \ne 2^\mathcal {C}\), and they are required to be nonempty, which is equivalent to \(\mathcal {C}\in \varGamma \). For convenience, we also consider \(\varGamma = \emptyset \) and \(\varGamma = 2^\mathcal {C}\) to be monotone structures. Then, for any \(\mathcal {D}\subseteq \mathcal {C}\), the restriction

$$ \varGamma |_{\mathcal {D}} = \{ S \subseteq \mathcal {D}: S \in \varGamma \} \subseteq 2^{\mathcal {D}} $$

defines a monotone structure \((\varGamma |_{ \mathcal {D}},\mathcal {D})\).
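To make the definitions concrete, a monotone structure can be represented by its family of minimal sets: a set belongs to \(\varGamma \) precisely when it contains some minimal set, and the restriction \(\varGamma |_{\mathcal {D}}\) is represented by exactly those minimal sets lying inside \(\mathcal {D}\). The Python sketch below (an illustrative choice of representation, not mandated by the paper) also covers the threshold structure that recovers k-out-of-N special-soundness:

```python
from itertools import combinations

def in_gamma(S, minimal_sets):
    """S is in the monotone structure iff it contains some minimal set."""
    S = frozenset(S)
    return any(T <= S for T in minimal_sets)

def restrict(minimal_sets, D):
    """Minimal sets of Gamma|_D: exactly the minimal sets lying inside D."""
    D = frozenset(D)
    return [T for T in minimal_sets if T <= D]

def threshold(C, k):
    """Threshold structure T_k: minimal sets are all k-subsets of C."""
    return [frozenset(T) for T in combinations(sorted(C), k)]

C = {1, 2, 3, 4, 5}
Gamma = threshold(C, 3)                  # threshold-3 structure on 5 challenges
assert in_gamma({1, 2, 4}, Gamma)
assert not in_gamma({1, 2}, Gamma)
assert in_gamma({1, 2, 3, 4}, Gamma)     # closed under taking supersets
D = {1, 2, 3, 4}
assert set(restrict(Gamma, D)) == set(threshold(D, 3))
```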

Definition 4

(Minimal Set). Let \((\varGamma ,\mathcal {C})\) be a monotone structure. A set \(S \in \varGamma \) is minimal if none of its proper subsets are in \(\varGamma \), i.e., for all \(T \subsetneq S\) it holds that \(T \notin \varGamma \). Further, \(M(\varGamma )\subseteq \varGamma \) denotes the set of minimal elements of \(\varGamma \).

Definition 5

(Distance to a Monotone Structure). For a nonempty monotone structure \((\varGamma ,\mathcal {C})\), we define the following distance function:

$$ d_\varGamma :2^\mathcal {C}\rightarrow \mathbb {N}_{0} \,, \quad S \mapsto \min _{T \in \varGamma } \,\left|T \setminus S\right| \,. $$

Equivalently,

$$ d_\varGamma :2^\mathcal {C}\rightarrow \mathbb {N}_{0} \,, \quad S \mapsto \min _{T \subseteq \mathcal {C}} \{ \left|T\right| : S \cup T \in \varGamma \} \,. $$

If \(\varGamma = \emptyset \), we define \(d_\varGamma \) to be identically equal to \(\infty \).

The value \(d_\varGamma (S) \in \mathbb {N}_{0}\) equals the minimum number of elements that have to be added to the set S to obtain an element of \(\varGamma \). In particular, \(d_\varGamma (S) = 0\) if and only if \(S \in \varGamma \). Hence, it shows how close S is to the monotone structure \(\varGamma \).
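Using a representation of \(\varGamma \) by its minimal sets (an illustrative choice, not part of the paper's formalism), the distance \(d_\varGamma \) can be computed directly: it suffices to minimize \(\left|T \setminus S\right|\) over the minimal \(T \in \varGamma \), since enlarging T can only increase \(\left|T \setminus S\right|\).

```python
from itertools import combinations

def d_gamma(S, minimal_sets):
    """d_Gamma(S): fewest elements to add to S to reach Gamma.
    Minimising |T - S| over the minimal sets T of Gamma suffices, since
    every T' in Gamma contains a minimal T with |T - S| <= |T' - S|."""
    if not minimal_sets:                 # Gamma empty: distance is infinite
        return float("inf")
    S = frozenset(S)
    return min(len(T - S) for T in minimal_sets)

# Example: threshold structure with k = 3 on a 5-element challenge set.
Gamma = [frozenset(T) for T in combinations(range(5), 3)]
assert d_gamma({0, 1, 2}, Gamma) == 0    # already in Gamma
assert d_gamma({0}, Gamma) == 2          # two more distinct challenges needed
assert d_gamma(set(), Gamma) == 3
assert d_gamma(set(), []) == float("inf")
```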

The key observation is now that typical knowledge extractors for interactive proofs proceed by extracting some set of accepting transcripts from a dishonest prover attacking the interactive proof. Subsequently, the knowledge extractor computes a witness from this set of accepting transcripts. Clearly, the set of sets of accepting transcripts from which a witness can be computed is closed under taking supersets, i.e., it is a monotone structure. Therefore, the following special-soundness notion for 3-round \(\varSigma \)-protocols follows naturally.

Definition 6

(\(\varGamma \)-out-of-\(\mathcal {C}\) Special-Soundness). Let \((\varGamma ,\mathcal {C})\) be a monotone structure. A 3-round public-coin interactive proof \((\mathcal {P},\mathcal {V})\) for relation R, with challenge set \(\mathcal {C}\), is \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound if there exists an algorithm that, on input a statement x and a set of accepting transcripts \((a,c_1,z_1), \dots , (a,c_k,z_k)\) with common first message a and such that \({\{c_1,\dots ,c_k\} \in \varGamma }\), runs in polynomial time and outputs a witness \(w \in R(x)\). We also say \((\mathcal {P},\mathcal {V})\) is \(\varGamma \)-special-sound.

The above definition is a generalization of k-out-of-N special-soundness, where the extractability is guaranteed when given k colliding accepting transcripts with common first message a and pairwise distinct challenges \(c_i\) that are elements of a challenge set with cardinality N. Hence, when \(\varGamma \) contains all sets of cardinality at least k, i.e., it is a threshold monotone structure, \(\varGamma \)-out-of-\(\mathcal {C}\) special-soundness reduces to k-out-of-N special-soundness, where \(N = \left|\mathcal {C}\right|\).

Remark 2

Formally, the monotone structure \((\varGamma ,\mathcal {C})\) of Definition 6 may depend on the size \(\left|x\right|\) of the public input x, i.e., it should actually be replaced by an ensemble \((\varGamma _{\lambda },\mathcal {C}_\lambda )\) of monotone structures indexed by the size \(\lambda \in \mathbb {N}\) of the public input of \((\mathcal {P},\mathcal {V})\). For simplicity, we will abuse notation by ignoring this dependency and simply writing \((\varGamma ,\mathcal {C})\).

4 Knowledge Extraction for \(\varGamma \)-out-of-\(\mathcal {C}\) Special-Sound \(\varSigma \)-Protocols

Our goal is to prove that, for certain monotone structures \((\varGamma ,\mathcal {C})\), \(\varGamma \)-out-of-\(\mathcal {C}\) special-soundness (tightly) implies knowledge soundness, and to determine the corresponding knowledge error. In order to prove this, we construct a knowledge extractor that, by querying a prover \(\mathcal {P}^*\) attacking the interactive proof, obtains a set of accepting transcripts with common first message and for which the challenges form a set in \(\varGamma \). Without loss of generality we may assume \(\mathcal {P}^*\) to be deterministic,Footnote 3 i.e., \(\mathcal {P}^*\) always outputs the same first message a. Hence, \(\mathcal {P}^*\) can be viewed as a (deterministic) function

$$ \mathcal {P}^* :\mathcal {C}\rightarrow \{0,1\}^* \,, \quad c \mapsto y=(a,c,z) \,, $$

that on input a challenge \(c\in \mathcal {C}\) outputs a protocol transcript \(y=(a,c,z)\).

Let \(A \subseteq \mathcal {C}\) be the set of challenges for which \(\mathcal {P}^*\) succeeds, i.e., \(A = \{c \in \mathcal {C}: V(\mathcal {P}^*(c))=1 \}\). Then the goal of the extractor is to find a set \(B \in \varGamma |_{A}\). The difficulty is that the extractor is only given oracle access to \(\mathcal {P}^*\) and therefore does not know the set A. For this reason, extractors typically proceed recursively as follows: if at some point the extractor has found some \(S \subseteq A\) with \(S \notin \varGamma \), it will try new challenges \(c \in \mathcal {C}\) until \(\mathcal {P}^*\) succeeds. The hope is then that \(S \cup \{c\} \subseteq A\) is “closer” to \(\varGamma |_{A}\) than S. More precisely, the extractor tries to find a \(c \in A \subseteq \mathcal {C}\) such that \(d_{\varGamma |_{A}}(S \cup \{c\}) < d_{\varGamma |_{A}}(S)\). Note that not all challenges c shorten the distance to \(\varGamma |_{A}\), e.g., \(d_{\varGamma |_{A}}(S \cup \{c\}) = d_{\varGamma |_{A}}(S)\) for all \(c \in S\). Since the extractor does not know the set A, it cannot evaluate this distance function.

However, for any S, the challenge set \(\mathcal {C}\) can be partitioned into a set of “useless” challenges and a set of “potentially useful” challenges. The useless challenges are the \(c\in \mathcal {C}\) such that \(d_{\varGamma |_{A}}(S \cup \{c\}) = d_{\varGamma |_{A}}(S)\) for all \(A \subseteq \mathcal {C}\) containing S, i.e., for all A useless challenges will not shorten the distance to \(\varGamma |_{A}\). For instance, all \(c \in S\) are useless challenges for any S and any \(\varGamma \). However, in some settings the set of useless challenges is larger than S, and in general this observation is crucial for the extractor to be efficient. In fact, this is the case for all interactive proofs that warrant a generalization of the existing threshold special-soundness notion. All challenges \(c \in \mathcal {C}\) that are not useless are potentially useful, i.e., for these challenges there exists an \(A \subseteq \mathcal {C}\) containing S such that \(d_{\varGamma |_{A}}(S \cup \{c\}) < d_{\varGamma |_{A}}(S)\). The set of useful challenges is denoted \(U_\varGamma (S)\), where the function \(U_\varGamma \) is formally defined below.

Definition 7

(Useful Elements). For a monotone structure \((\varGamma ,\mathcal {C})\), we define the following function:

$$ U_{\varGamma } :2^\mathcal {C}\rightarrow 2^\mathcal {C}, \quad S \mapsto \bigl \{ c \in \mathcal {C}\setminus S : \begin{array}{c} \exists A \in \varGamma \text { s.t. } S \subset A \,\wedge \, A \setminus \{c\} \notin \varGamma \end{array} \bigr \} \,. $$

Note that \(\varGamma = \emptyset \) implies \(U_\varGamma (S) = \emptyset \) for all \(S\subseteq \mathcal {C}\). Moreover, if \(\varGamma \) is nonempty, \(U_\varGamma (S) = \emptyset \) if and only if \(S \in \varGamma \).
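The set \(U_\varGamma (S)\) can be computed by brute force directly from Definition 7, again representing \(\varGamma \) by its minimal sets (an illustrative sketch, exponential in \(\left|\mathcal {C}\right|\); note that \(A \in \varGamma \) and \(A \setminus \{c\} \notin \varGamma \) force \(c \in A\)):

```python
from itertools import combinations

def powerset(C):
    C = sorted(C)
    return [frozenset(T) for r in range(len(C) + 1)
            for T in combinations(C, r)]

def in_gamma(S, minimal_sets):
    S = frozenset(S)
    return any(T <= S for T in minimal_sets)

def useful(S, C, minimal_sets):
    """U_Gamma(S): all c outside S certified by some A in Gamma with
    S a subset of A and A - {c} not in Gamma (brute force over 2^C)."""
    S = frozenset(S)
    return {c for c in set(C) - S
            if any(in_gamma(A, minimal_sets) and S <= A and c in A
                   and not in_gamma(A - {c}, minimal_sets)
                   for A in powerset(C))}

# A non-threshold structure: a witness from challenges {0,1} OR {2,3,4}.
Gamma = [frozenset({0, 1}), frozenset({2, 3, 4})]
C = {0, 1, 2, 3, 4}
assert useful(set(), C, Gamma) == C           # here U_Gamma(emptyset) = C
assert useful({0}, C, Gamma) == {1, 2, 3, 4}
assert useful({0, 1}, C, Gamma) == set()      # S in Gamma: nothing useful
```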

The following lemma shows that for any \(c\in U_\varGamma (S)\), there exists an \(A \in \varGamma \) containing \(S \cup \{c\}\) such that

$$ d_{\varGamma |_{A}}(S \cup \{c\}) < d_{\varGamma |_{A}}(S)\,, $$

i.e., the challenges \(c \in U_\varGamma (S)\) are indeed potentially useful to the extractor. Even more so, it is essential that the extractor considers all challenges \(c \in U_\varGamma (S)\). Namely, for every \(c \in U_\varGamma (S)\), it might be the case that the \(A\in \varGamma \) that “certifies” c, i.e., the A such that \(S \subset A\) and \(A \setminus \{c\} \notin \varGamma \), corresponds to the challenges for which the prover \(\mathcal {P}^*\) succeeds. Since \(A \setminus \{c\} \notin \varGamma \), the extractor can only succeed if it considers the challenge \(c\in U_\varGamma (S)\) at some point.

The same lemma shows that challenges \(c \notin U_\varGamma (S)\) will never decrease the distance, i.e., they are indeed useless to the extractor. More precisely, if \(c \notin U_\varGamma (S)\), for every \(A \in \varGamma \) containing \(S \cup \{c\}\) it holds that

$$ d_{\varGamma |_{A}}(S \cup \{c\}) = d_{\varGamma |_{A}}(S)\,. $$

Lemma 2

Let \((\varGamma ,\mathcal {C})\) be a monotone structure and \(S \subset \mathcal {C}\). Then \(c \in U_\varGamma (S)\) if and only if there exists an \(A \in \varGamma \) containing \(S \cup \{c\}\) such that

$$ d_{\varGamma |_{A}}(S \cup \{c\}) < d_{\varGamma |_{A}}(S)\,. $$

For the proof of Lemma 2, we refer to the full version [1].

We also derive the following lemma, which shows that even if all useless challenges \(c \in \mathcal {C}\setminus U_\varGamma (S)\) are added to the set \(S \in 2^\mathcal {C}\setminus \varGamma \), the resulting subset is still not in \(\varGamma \).

Lemma 3

Let \((\varGamma ,\mathcal {C})\) be a monotone structure and \(S \in 2^\mathcal {C}\setminus \varGamma \). Then, \({(\mathcal {C}\setminus U_\varGamma (S) ) \cup S \notin \varGamma }\).

For the proof of Lemma 3, we refer to the full version [1].

The knowledge extractor will be restricted to sampling challenges that are potentially useful. The value \(t_\varGamma \) defines the maximum number of accepting transcripts that the extractor has to find before it succeeds and obtains the accepting transcripts for a set \(S\in \varGamma \). The efficiency of our knowledge extractor will depend on \(t_\varGamma \). A formal definition is given below. Further, in Sect. 5, we describe the monotone structure and corresponding t-values for three (classes of) interactive proofs and explain their relevance.

Definition 8

(t-value). Let \((\varGamma ,\mathcal {C})\) be a monotone structure and \(S \subseteq \mathcal {C}\). Then

$$ t_\varGamma (S) := \max \Biggl \{ t \in \mathbb {N}_{0} : \begin{array}{c} \exists c_1,\dots ,c_t \in \mathcal {C}\text { s.t. } \\ c_i \in U_\varGamma \bigl (S \cup \{c_1,\dots ,c_{i-1}\}\bigr ) \,\, \forall i \end{array} \Biggr \}\,. $$

Further,

$$ t_\varGamma := t_\varGamma (\emptyset ) \,. $$

It is easily seen that \(t_\varGamma (S) = 0\) if and only if \(S \in \varGamma \) or \(\varGamma = \emptyset \). Further, the following lemma shows that adding an element \(c \in U_{\varGamma }(S)\) to S decreases the corresponding t-value. This lemma plays a pivotal role in our recursive extraction algorithm.

Lemma 4

Let \((\varGamma ,\mathcal {C})\) be a nonempty monotone structure and let \(S \subseteq \mathcal {C}\) such that \(S \notin \varGamma \). Then, for all \(c \in U_{\varGamma }(S)\),

$$ t_{\varGamma }(S\cup \{c\}) < t_{\varGamma }(S) \,. $$

For the proof of Lemma 4, we refer to the full version [1].
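Definition 8 translates into a short recursion: \(t_\varGamma (S) = 0\) when \(U_\varGamma (S) = \emptyset \), and otherwise \(t_\varGamma (S) = 1 + \max _{c \in U_\varGamma (S)} t_\varGamma (S \cup \{c\})\). The brute-force sketch below (our illustration, not from the paper) computes this and checks Lemma 4 on a small threshold structure, for which \(t_\varGamma = k\):

```python
from itertools import combinations

def powerset(C):
    C = sorted(C)
    return [frozenset(T) for r in range(len(C) + 1)
            for T in combinations(C, r)]

def in_gamma(S, minimal_sets):
    S = frozenset(S)
    return any(T <= S for T in minimal_sets)

def useful(S, C, minimal_sets):
    S = frozenset(S)
    return {c for c in set(C) - S
            if any(in_gamma(A, minimal_sets) and S <= A and c in A
                   and not in_gamma(A - {c}, minimal_sets)
                   for A in powerset(C))}

def t_value(S, C, minimal_sets):
    """t_Gamma(S): length of a longest chain of successively useful
    challenges starting from S (Definition 8, computed recursively)."""
    U = useful(S, C, minimal_sets)
    if not U:
        return 0
    return 1 + max(t_value(frozenset(S) | {c}, C, minimal_sets) for c in U)

# Threshold structure with k = 2 on 4 challenges: t_Gamma = k = 2.
C = {0, 1, 2, 3}
Gamma = [frozenset(T) for T in combinations(C, 2)]
assert t_value(set(), C, Gamma) == 2
# Lemma 4: adding a useful challenge strictly decreases the t-value.
S = frozenset({0})
assert all(t_value(S | {c}, C, Gamma) < t_value(S, C, Gamma)
           for c in useful(S, C, Gamma))
```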

As in [3], we describe our technical results in a more abstract language. This will later allow us to easily derive composition results and handle more complicated scenarios, such as multi-round interactive proofs and parallel compositions. To this end, let us consider a finite set \(\mathcal {C}\), a probabilistic algorithm \(\mathcal {A}:\mathcal {C}\rightarrow \{0,1\}^*\) and a verification function \(V :\mathcal {C}\times \{0,1\}^* \rightarrow \{0,1\}\). An output \(y \leftarrow \mathcal {A}(c)\) of the algorithm \(\mathcal {A}\) on input \(c \in \mathcal {C}\) is said to be accepting or correct if \(V(c,y)=1\). The success probability of \(\mathcal {A}\) is denoted as

$$ \epsilon (\mathcal {A}) := \Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) = 1 \bigr ) \,, $$

where C is uniformly random in \(\mathcal {C}\). The obvious instantiation of \(\mathcal {A}\) is given by a deterministic dishonest prover \(\mathcal {P}^*\) attacking an interactive proof \(\varPi \) on input x. Note that even though it is sufficient to consider deterministic provers \(\mathcal {P}^*\), we allow the algorithm \(\mathcal {A}\) to be probabilistic. This generalization is essential when considering multi-round interactive proofs and parallel repetitions [3].

Now let \(\varGamma \subseteq 2^\mathcal {C}\) be a nonempty monotone structure. Then, for any \(S \subset \mathcal {C}\) with \(U_\varGamma (S)\ne \emptyset \), we define

$$ \epsilon _{\varGamma }(\mathcal {A},S) := \Pr \bigl ( V(C, \mathcal {A}(C) ) = 1 \mid C \in U_\varGamma (S) \bigr ) \,. $$

Typically, \(U_\varGamma (\emptyset )= \mathcal {C}\) and thus \(\epsilon (\mathcal {A}) = \epsilon _{\varGamma }(\mathcal {A},\emptyset )\), i.e., all challenges \(c \in \mathcal {C}\) are potentially useful. However, this is not necessarily the case.

Given oracle access to \(\mathcal {A}\), the goal of the extractor is to find correct outputs \(y_1,\ldots , y_k\) for challenges \(c_1,\ldots , c_k \in \mathcal {C}\) such that \(\{c_1,\dots ,c_k\} \in \varGamma \), i.e., such that \({V(c_i, y_i) = 1}\) for all i. If \(\mathcal {A}\) corresponds to a dishonest prover attacking a \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive proof on some input x, a witness w for statement x can be efficiently computed from the outputs \(y_1,\dots ,y_k\).
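The extraction strategy sketched above (repeatedly query potentially useful challenges and keep the accepting ones until the challenge set lands in \(\varGamma \)) can be illustrated by the following naive toy loop. It is only a sketch under simplifying assumptions: the query budget `max_queries`, the oracle `A_oracle`, and the verification function `V` are hypothetical illustration parameters, and the loop omits the careful run-time and success-probability accounting that the paper's actual extractor requires. It assumes \(\varGamma \) is nonempty, so that \(U_\varGamma (S) \ne \emptyset \) whenever \(S \notin \varGamma \).

```python
import random
from itertools import combinations

def powerset(C):
    C = sorted(C)
    return [frozenset(T) for r in range(len(C) + 1)
            for T in combinations(C, r)]

def in_gamma(S, minimal_sets):
    S = frozenset(S)
    return any(T <= S for T in minimal_sets)

def useful(S, C, minimal_sets):
    S = frozenset(S)
    return {c for c in set(C) - S
            if any(in_gamma(A, minimal_sets) and S <= A and c in A
                   and not in_gamma(A - {c}, minimal_sets)
                   for A in powerset(C))}

def extract(A_oracle, V, C, minimal_sets, max_queries=10_000):
    """Toy extraction loop: query potentially useful challenges until the
    accepting challenges form a set in Gamma, then return the transcripts."""
    S, outputs, queries = frozenset(), {}, 0
    while not in_gamma(S, minimal_sets):
        if queries >= max_queries:
            return None                      # give up (toy budget only)
        c = random.choice(sorted(useful(S, C, minimal_sets)))
        queries += 1
        y = A_oracle(c)                      # one query to the prover
        if V(c, y):
            S, outputs = S | {c}, {**outputs, c: y}
    return [(c, outputs[c]) for c in sorted(S)]

# A prover that answers correctly only on challenges {1, 3}; under the
# threshold-2 structure these two accepting transcripts suffice.
random.seed(0)
Gamma = [frozenset(T) for T in combinations(range(5), 2)]
ts = extract(lambda c: "z" if c in {1, 3} else "garbage",
             lambda c, y: y == "z", set(range(5)), Gamma)
assert ts is not None and {c for c, _ in ts} == {1, 3}
```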

Let us further define the following quality measure for the algorithm \(\mathcal {A}\):

$$\begin{aligned} \delta _{\varGamma }(\mathcal {A}):= \min _{S \notin \varGamma } \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin S \bigr ) \, . \end{aligned}$$
(1)

The value \(\delta _{\varGamma }(\mathcal {A})\) defines a “punctured” success probability of \(\mathcal {A}\), i.e., it equals the success probability of \(\mathcal {A}\) when the challenge c is sampled uniformly at random from some set \(\mathcal {C}\setminus S \supseteq U_\varGamma (S)\) such that S is not in the monotone structure. We will show that the value \(\delta _\varGamma (\mathcal {A})\) measures how well we can extract from the algorithm \(\mathcal {A}\). The value \(\delta _\varGamma (\mathcal {A})\) is a generalization of the measure

$$ \delta _k(\mathcal {A}):= \min _{S\subseteq \mathcal {C}: |S|=k-1} \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin S \bigr ) \,, $$

defined in [3]. However, when restricting to threshold monotone structures, there is a syntactic difference between the definitions of \(\delta _k(\mathcal {A})\) and \(\delta _{\varGamma }(\mathcal {A})\). To see this, let \(\mathcal {T}_k\) denote the monotone structure containing all subsets of \(\mathcal {C}\) with cardinality at least k. Then, in the definition of \(\delta _k(\mathcal {A})\) the minimum is over all sets of cardinality exactly \(k-1\), whereas the corresponding \(\delta _{\mathcal {T}_k}(\mathcal {A})\) is a minimum over all sets of size at most \(k-1\). In the threshold case this makes no difference: it is easily seen that there always exists a (maximal) set of size \(k-1\) that minimizes \(\delta _{\mathcal {T}_k}(\mathcal {A})\) and so indeed \(\delta _{\mathcal {T}_k}(\mathcal {A}) = \delta _k(\mathcal {A})\). A similar result does not hold for arbitrary access structures, i.e., in general the minimum may not be attained by a maximal set \(S \notin \varGamma \). This issue will reoccur in a more substantial way when addressing multi-round protocols.
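For intuition, the coincidence \(\delta _{\mathcal {T}_k}(\mathcal {A}) = \delta _k(\mathcal {A})\) can be checked by brute force on a toy instance. The following Python sketch (illustrative names; it models a deterministic \(\mathcal {A}\) that answers correctly exactly on a fixed set of challenges) enumerates both minima:

```python
from fractions import Fraction
from itertools import combinations

def delta_threshold(C, good, k, exact):
    """Brute-force punctured success probability of a deterministic A that
    answers correctly exactly on the challenges in `good`. With exact=True
    the minimum ranges over sets S of size exactly k-1 (delta_k); otherwise
    over all S with |S| <= k-1, i.e., over all S outside T_k (delta_{T_k})."""
    sizes = [k - 1] if exact else range(k)
    values = []
    for s in sizes:
        for S in combinations(C, s):
            outside = [c for c in C if c not in S]
            values.append(Fraction(sum(c in good for c in outside), len(outside)))
    return min(values)

C, good, k = range(6), {0, 1, 2, 3}, 3
assert delta_threshold(C, good, k, exact=True) == delta_threshold(C, good, k, exact=False)
```

Enlarging S only removes challenges from the conditional sample space, which is why the minimum over all \(S \notin \mathcal {T}_k\) is attained by a maximal such set.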

For any set \(T \in 2^\mathcal {C}\setminus \varGamma \), we also define

$$ \delta _{\varGamma }(\mathcal {A},T) := \min _{S : S \cup T \notin \varGamma } \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin S\bigr ) \,. $$

Since \(S \cup T \notin \varGamma \) implies \(S\cup T' \notin \varGamma \) for all \(T'\subseteq T\), it follows that

$$\begin{aligned} \delta _{\varGamma }(\mathcal {A},T') \le \delta _{\varGamma }(\mathcal {A},T),\quad \forall \, T' \subseteq T\,. \end{aligned}$$
(2)

Further, by Lemma 3, it follows that \(\bigl (\mathcal {C}\setminus U_{\varGamma }(T)\bigr ) \cup T \notin \varGamma \) for all \(T \notin \varGamma \). Hence,

$$\begin{aligned} \begin{aligned} \delta _{\varGamma }(\mathcal {A},T) & = \min _{S : S \cup T \notin \varGamma } \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin S \bigr ) \\ & \le \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin \mathcal {C}\setminus U_\varGamma (T) \bigr ) \\ & = \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \in U_\varGamma (T) \bigr ) \\ & = \epsilon _{\varGamma }(\mathcal {A},T)\,. \end{aligned} \end{aligned}$$
(3)

We are now ready to define and analyze our extraction algorithm for \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive \(\varSigma \)-protocols. The extractor is defined in Fig. 1 and its properties are summarized in the following lemma.

Lemma 5

(Extraction Algorithm - \(\varSigma \)-protocols). Let \((\varGamma ,\mathcal {C})\) be a nonempty monotone structure and let \(V :\mathcal {C}\times \{0,1\}^* \rightarrow \{0,1\}\). Then there exists an oracle algorithm \(\mathcal {E}_{\varGamma }\) with the following properties: The algorithm \(\mathcal {E}_{\varGamma }^{\mathcal {A}}\), given oracle access to a (probabilistic) algorithm \(\mathcal {A}:\mathcal {C}\rightarrow \{0,1\}^*\), requires an expected number of at most \( 2t_\varGamma - 1 \) queries to \(\mathcal {A}\) and, with probability at least \(\delta _\varGamma (\mathcal {A})/t_\varGamma \), it outputs pairs \((c_1,y_1), (c_2,y_2),\dots , (c_k,y_k) \in \mathcal {C}\times \{0,1\}^*\) with \(V(c_i,y_i)= 1\) for all i and \(\{c_1,\dots ,c_k\} \in \varGamma \).

Fig. 1. Recursive Expected Polynomial Time Extractor \(\mathcal {E}_{\varGamma }^\mathcal {A}(S)\).
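The following Python sketch (illustrative only, not the authors' pseudocode) mirrors the recursive extractor of Fig. 1, specialized to the threshold structure \(\mathcal {T}_k\), for which \(U_\varGamma (S) = \mathcal {C}\setminus S\) and \(t_\varGamma (S) = k - \left|S\right|\). The Bernoulli coin with heads probability \(\epsilon _\varGamma (\mathcal {A},S)\) is realized by one fresh \(\mathcal {A}\)-query per trial, matching the query count in the proof below:

```python
import random

def make_extractor(C, k, A, V):
    """Sketch of the recursive extractor, specialized to the threshold
    structure T_k over C (a set is in the structure iff it has size >= k,
    and U(S) = C \\ S). A is the prover oracle, V the verification function;
    queries to A are counted in `queries[0]`."""
    queries = [0]

    def query(c):
        queries[0] += 1
        return A(c)

    def extract(S):
        # S: frozenset of challenges for which accepting answers are known
        c1 = random.choice([c for c in C if c not in S])   # c1 <-R U(S)
        y1 = query(c1)
        if not V(c1, y1):
            return None                                    # abort
        if len(S) + 1 >= k:
            return [(c1, y1)]                              # S u {c1} in Gamma
        while True:
            # first geometric experiment: recurse on S u {c1}
            rest = extract(S | {c1})
            if rest is not None:
                return [(c1, y1)] + rest
            # second geometric experiment: one fresh A-query as the coin
            c = random.choice([c for c in C if c not in S])
            if V(c, query(c)):
                return None                                # heads: give up
    return extract, queries

# a prover that answers every challenge correctly
extract, queries = make_extractor(list(range(10)), 3, lambda c: c, lambda c, y: y == c)
out = extract(frozenset())
assert out is not None and len({c for c, _ in out}) == 3   # a set in T_3
```

With an always-correct \(\mathcal {A}\), every recursive call succeeds on its first query, so exactly \(t_\varGamma = k\) queries are made.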

Proof

The extractor \(\mathcal {E}_\varGamma ^{\mathcal {A}}(S)\) is formally defined in Fig. 1. It takes as input a subset \(S\in 2^ \mathcal {C}\setminus \varGamma \). The input S represents the set of accepting challenges that the extractor has already found, i.e., the goal of \(\mathcal {E}_\varGamma ^{\mathcal {A}}(S)\) is to find pairs \((c_i,y_i)\) such that \(V(c_i,y_i)=1\) and \(\{c_1,\dots ,c_k\} \cup S \in \varGamma \). Further, we define

$$ \mathcal {E}_\varGamma ^{\mathcal {A}} := \mathcal {E}_{\varGamma }^{\mathcal {A}}(\emptyset ) \,. $$

First note that, since \(\varGamma \ne \emptyset \) and thus \(U_\varGamma (S) \ne \emptyset \) for all \(S \notin \varGamma \), the extractor is well-defined. Let us now analyze the success probability and the expected number of \(\mathcal {A}\)-queries of the extractor.

Success Probability. By induction over \(t_\varGamma (S)\), we will prove that \(\mathcal {E}_{\varGamma }^\mathcal {A}(S)\) succeeds with probability at least

$$ \frac{\delta _\varGamma (\mathcal {A},S)}{t_\varGamma (S)} \,. $$

We first consider the base case. To this end, let \(S \subseteq \mathcal {C}\) with \(t_\varGamma (S) = 1\). Then, by Lemma 4, for all \(c_1 \in U_\varGamma (S)\), it holds that \(t_\varGamma (S \cup \{c_1\}) = 0\) and thus \(S \cup \{c_1\} \in \varGamma \). Therefore, the extractor succeeds if and only if \(V\bigl (c_1,\mathcal {A}(c_1)\bigr )=1\) for the \(c_1\) sampled from \(U_\varGamma (S)\). Hence, the success probability of the extractor equals

$$ \begin{aligned} \epsilon _{\varGamma }(\mathcal {A},S) & \ge \delta _\varGamma (\mathcal {A},S) \,, \end{aligned} $$

where the inequality follows from Eq. 3. This proves the bound on the success probability for the base case \(t_\varGamma (S) =1\).

Let us now consider an arbitrary subset \(S \subseteq \mathcal {C}\) with \(t_\varGamma (S) >1\) and assume that the claimed bound holds for all subsets \(T\subseteq \mathcal {C}\) with \(t_\varGamma (T) < t_\varGamma (S)\).

In the first step, the extractor succeeds with probability \(\epsilon _{\varGamma }(\mathcal {A},S)\) in finding a \(c_1 \in U_\varGamma (S)\) and \(y_1 \leftarrow \mathcal {A}(c_1)\) with \(V(c_1,y_1)=1\). If \(\{c_1\} \cup S \in \varGamma \), the extractor has successfully completed its task. If not, the extractor starts running two geometric experiments until one of them finishes. In the first geometric experiment the extractor repeatedly runs \(\mathcal {E}_{\varGamma }^\mathcal {A}( S \cup \{c_1\})\). By Lemma 4, it holds that \({t_\varGamma ( S \cup \{c_1\} ) < t_\varGamma (S)}\). Hence, by the induction hypothesis, \(\mathcal {E}_{\varGamma }^\mathcal {A}( S \cup \{c_1\})\) succeeds with probability

$$ p \ge \frac{\delta _{\varGamma }(\mathcal {A}, S \cup \{c_1\})}{t_\varGamma ( S \cup \{c_1\})} \ge \frac{\delta _{\varGamma }(\mathcal {A}, S )}{t_\varGamma ( S) - 1} \,, $$

where the second inequality follows from Eq. 2 and Lemma 4. In the second geometric experiment, the extractor tosses a coin that returns heads with probability

$$ q:=\epsilon _{\varGamma }(\mathcal {A},S) \,. $$

The second step of the extractor succeeds if the second geometric experiment does not finish before the first, and so by Lemma 1 this probability is lower bounded as follows

$$ \begin{aligned} \Pr \bigl ( {\text {Geo}}(p) \le {\text {Geo}}(q) \bigr ) & \ge \frac{p}{p+q} \ge \frac{\frac{\delta _{\varGamma }(\mathcal {A},S)}{t_\varGamma ( S)-1}}{\frac{\delta _{\varGamma }(\mathcal {A},S)}{t_\varGamma ( S)-1} + \epsilon _{\varGamma }(\mathcal {A},S) } \\ {} & \ge \frac{\frac{\delta _{\varGamma }(\mathcal {A},S)}{t_\varGamma ( S)-1}}{\frac{\epsilon _{\varGamma }(\mathcal {A},S)}{t_\varGamma ( S)-1} + \epsilon _{\varGamma }(\mathcal {A},S) } = \frac{\delta _{\varGamma }(\mathcal {A},S)}{t_\varGamma (S) \cdot \epsilon _{\varGamma }(\mathcal {A},S) } \, , \end{aligned} $$

where the second inequality follows from the monotonicity of the function \(p \mapsto p/(p+q)\) and the third inequality follows from the fact that \(\delta _{\varGamma }(\mathcal {A},S) \le \epsilon _{\varGamma }(\mathcal {A},S)\) (Eq. 3).

Since the first step of the extractor succeeds with probability \(\epsilon _\varGamma (\mathcal {A},S)\), it follows that \(\mathcal {E}_{\varGamma }^{\mathcal {A}}(S)\) succeeds with probability at least \(\epsilon _{\varGamma }(\mathcal {A},S) \cdot \frac{\delta _{\varGamma }(\mathcal {A},S)}{t_\varGamma (S) \cdot \epsilon _{\varGamma }(\mathcal {A},S)} = \delta _\varGamma (\mathcal {A},S)/t_\varGamma (S)\) for all \(S \in 2^\mathcal {C}\setminus \varGamma \), which proves the claimed bound. In particular, \(\mathcal {E}_{\varGamma }^{\mathcal {A}}\) succeeds with probability at least \(\delta _{\varGamma }(\mathcal {A})/t_\varGamma \).
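The step invoking Lemma 1 can be sanity-checked numerically. Assuming \({\text {Geo}}(p)\) counts Bernoulli(p) trials up to and including the first success, a direct computation gives the closed form \(\Pr ({\text {Geo}}(p) \le {\text {Geo}}(q)) = p/(p+q-pq) \ge p/(p+q)\); the sketch below (illustrative only) verifies this against the truncated series:

```python
from fractions import Fraction

def geo_race(p, q, terms=1000):
    """Truncated series for Pr(Geo(p) <= Geo(q)) with independent geometric
    variables: sum over k >= 0 of p * ((1-p)(1-q))^k."""
    return sum(p * ((1 - p) * (1 - q)) ** k for k in range(terms))

for p, q in [(Fraction(1, 3), Fraction(1, 2)), (Fraction(1, 10), Fraction(9, 10))]:
    exact = p / (p + q - p * q)                          # closed form
    assert abs(geo_race(float(p), float(q)) - float(exact)) < 1e-9
    assert exact >= p / (p + q)                          # the bound from Lemma 1
```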

Expected Number of \(\mathcal {A}\)-Queries. By induction over \(t_\varGamma (S)\), we will prove that the expected number of \(\mathcal {A}\)-queries \(Q_\varGamma (S)\) made by \(\mathcal {E}_{\varGamma }^\mathcal {A}(S)\) is upper bounded as follows:

$$ Q_\varGamma (S) \le 2t_\varGamma (S)-1\,. $$

We first consider the base case. To this end, let \(S \subseteq \mathcal {C}\) with \(t_\varGamma (S) = 1\). In this case, \(\{c_1\} \cup S \in \varGamma \) for all \(c_1 \in U_\varGamma (S)\). Hence, \(\mathcal {E}_{\varGamma }^\mathcal {A}(S)\) either succeeds or fails after making exactly one \(\mathcal {A}\)-query, i.e., \(Q_\varGamma (S) = 1 = 2t_\varGamma (S)-1\), which proves the base case.

Let us now consider an arbitrary subset \(S \subseteq \mathcal {C}\) with \(t_\varGamma (S) >1\) and assume that \(Q_\varGamma (T) \le 2 t_\varGamma (T)-1\) for all subsets \(T\subseteq \mathcal {C}\) with \(t_\varGamma (T) < t_\varGamma (S)\).

The extractor \(\mathcal {E}_{\varGamma }^\mathcal {A}(S)\) first samples \(c_1\leftarrow _R U_\varGamma (S)\) uniformly at random and evaluates \(y_1\leftarrow \mathcal {A}(c_1)\). This requires exactly one \(\mathcal {A}\)-query. After this step the extractor aborts with probability \(1-\epsilon _\varGamma (\mathcal {A},S)\). Otherwise, and if \(\{c_1\} \cup S \notin \varGamma \), it continues running the two geometric experiments until either one of them finishes. The second geometric experiment finishes in an expected number of \(1/\epsilon _\varGamma (\mathcal {A},S)\) trials and requires exactly one \(\mathcal {A}\)-query per trial. Hence, the total expected number of trials for both experiments is at most \(1/\epsilon _\varGamma (\mathcal {A},S)\). Further, since \(t_\varGamma (S \cup \{c_1\}) < t_\varGamma (S)\) (Lemma 4) and by the induction hypotheses, the expected number of \(\mathcal {A}\)-queries of the first geometric experiment is at most

$$ Q_\varGamma (S\cup \{c_1\}) \le 2t_\varGamma (S\cup \{c_1\}) -1 \le 2t_\varGamma (S) - 3 \,, $$

per iteration, where the second inequality follows again from Lemma 4. Hence, every iteration of the repeat loop requires an expected number of at most \(2t_\varGamma (S) - 2\) \(\mathcal {A}\)-queries.

From this it follows that

$$ Q_\varGamma (S) \le 1 + \epsilon _{\varGamma }(\mathcal {A},S) \frac{2t_\varGamma (S) - 2}{\epsilon _{\varGamma }(\mathcal {A},S)} = 2t_\varGamma (S) - 1 \,, $$

for all \(S \in 2^\mathcal {C}\setminus \varGamma \). In particular, \(\mathcal {E}_\varGamma ^\mathcal {A}\) requires an expected number of at most \(2t_\varGamma -1\) \(\mathcal {A}\)-queries, which completes the proof of the lemma.

   \(\square \)

By basic probability theory, for any \(S \notin \varGamma \),

$$ \begin{aligned} \Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \mid C \notin S \bigr ) & = \frac{\Pr \bigl ( V( C, \mathcal {A}(C) ) = 1 \,\wedge \, C \notin S \bigr )}{\Pr \bigl (C \notin S \bigr )} \\ & \ge \frac{\Pr \bigl ( V( C, \mathcal {A}(C) ) = 1) - \Pr \bigl (C \in S \bigr )}{\Pr \bigl (C \notin S \bigr )} \\ & = \frac{\epsilon (\mathcal {A}) - \Pr \bigl (C \in S \bigr )}{1- \Pr \bigl (C \in S \bigr )} \\ & = \frac{\epsilon (\mathcal {A}) - \left|S\right|/\left|\mathcal {C}\right|}{1- \left|S\right|/\left|\mathcal {C}\right|} \, . \end{aligned} $$

Hence, taking the minimum over all \(S \notin \varGamma \) shows that

$$\begin{aligned} \begin{aligned} \delta _{\varGamma }(\mathcal {A}) \ge \frac{\epsilon (\mathcal {A}) - \kappa _\varGamma }{1- \kappa _\varGamma }\,, \end{aligned} \end{aligned}$$
(4)

where \(\kappa _\varGamma = \max _{S \notin \varGamma } \left|S\right|/\left|\mathcal {C}\right|\). In \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive proofs, a dishonest prover can potentially take any \(S \notin \varGamma \) and choose the first message so that it will succeed if the verifier chooses a challenge \(c \in S\). Hence, \(\kappa _\varGamma \) equals the success probability of this trivial cheating strategy for \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive proofs.
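Inequality (4) can be checked by exhaustive enumeration on a toy threshold instance (a deterministic \(\mathcal {A}\) that is correct on a fixed set of challenges; all names hypothetical):

```python
from fractions import Fraction
from itertools import combinations

C = range(5)
k = 3                               # threshold structure: S in Gamma iff |S| >= 3
good = {0, 1, 4}                    # deterministic A succeeds on these challenges

eps = Fraction(len(good), len(C))   # success probability of A
kappa = Fraction(k - 1, len(C))     # max_{S not in Gamma} |S| / |C|

# punctured success probability: minimum over all S outside the structure
delta = min(
    Fraction(sum(c in good for c in C if c not in S), len(C) - len(S))
    for s in range(k) for S in map(set, combinations(C, s))
)
assert delta >= (eps - kappa) / (1 - kappa)      # Eq. (4)
```

In this example the bound is tight: removing the two good challenges 0 and 1 leaves one good challenge among three, so \(\delta _\varGamma (\mathcal {A}) = 1/3 = (\epsilon (\mathcal {A}) - \kappa _\varGamma )/(1 - \kappa _\varGamma )\).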

Since the extractor succeeds with probability at least \(\delta _{\varGamma }(\mathcal {A})/t_\varGamma \), the following theorem follows.

Theorem 1

Let \((\mathcal {P},\mathcal {V})\) be a \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound \(\varSigma \)-protocol such that \(t_\varGamma \) is polynomial in the size \(\left|x\right|\) of the public input statement x of \((\mathcal {P},\mathcal {V})\) and sampling from \(U_\varGamma (S)\) takes polynomial time (in \(\left|x\right|\)) for all S with \(\left|S\right| < t_\varGamma \). Then \((\mathcal {P},\mathcal {V})\) is knowledge sound with knowledge error \(\kappa _\varGamma = \max _{S \notin \varGamma } \left|S\right|/\left|\mathcal {C}\right|\).

5 Examples

In this section, we describe four simple interactive proofs and their special-soundness properties. The first example shows that, for the special case of the k-out-of-N special-soundness notion, we recover the known results. The second and third examples present techniques that have found numerous applications, but cannot be analyzed via their threshold special-soundness properties, i.e., these interactive proofs require an alternative analysis. Our knowledge extractor offers the means to easily handle these interactive proofs as well. Finally, the fourth example shows that our generic techniques do not always suffice. In Sect. 7, we will consider a more complicated protocol and demonstrate how our techniques enable a knowledge soundness analysis of the multi-round protocol FRI [5].

Example 1

(Threshold Access Structures). Let \(\mathcal {C}\) be a finite set with cardinality N, and let \(\varGamma \) be the monotone structure that contains all subsets of \(\mathcal {C}\) of cardinality at least \(k\le N\). Then a \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive proof is also k-out-of-N special-sound. Moreover, \(U_\varGamma (A) = \mathcal {C}\setminus A\) for all \(A \notin \varGamma \), \(t_\varGamma =k\), and \(\kappa _\varGamma = (k-1)/N\). Hence, in the case of k-out-of-N special-soundness, we recover the results from [3].

Example 2

(Standard Amortization Technique). Let \(\mathbb {F}\) be a finite field and let \(\varPsi \) be an \(\mathbb {F}\)-linear map. The following amortization technique, known from \(\varSigma \)-protocol theory, allows a prover to prove knowledge of n \(\varPsi \)-preimages \(x_1,\dots ,x_n\) of \(P_1,\dots ,P_n\) for essentially the cost of one. The amortization technique is a 2-round protocol that proceeds as follows. First, the verifier samples a challenge vector \(\textbf{c}= (c_1,\dots ,c_n) \in \mathbb {F}^n\) uniformly at random. Second, upon receiving the challenge vector \(\textbf{c}\), the prover responds with the element \(z = \sum _{i=1}^n c_i x_i\). Finally, the verifier checks that \(\varPsi (z) = \sum _{i=1}^n c_i P_i\). Hence, instead of sending n preimages the prover only has to send one preimage.

The n preimages \(x_1,\dots ,x_n\) of \(P_1,\dots ,P_n\) can be extracted from accepting transcripts \((\textbf{c}_1,z_1),\dots ,(\textbf{c}_k,z_k)\) if the challenge vectors \(\textbf{c}_1,\dots ,\textbf{c}_k\) span the vector space \(\mathbb {F}^n\). Hence, the amortization protocol is \(\varGamma \)-out-of-\(\mathbb {F}^n\) special-sound, where \(\varGamma \) is the monotone structure that contains all subsets spanning \(\mathbb {F}^n\). Further, \(t_\varGamma = n\), \(U_\varGamma (A) = \mathbb {F}^n \setminus {\text {span}}(A)\) for all \(A \notin \varGamma \); and \(\kappa _\varGamma = 1/\left|\mathbb {F}\right|\); thus, we obtain optimal knowledge soundness.

At the same time, the amortization protocol is \((\left|\mathbb {F}\right|^{n-1}+1)\)-out-of-\(\left|\mathbb {F}\right|^n\) special-sound, i.e., the threshold special-soundness parameter of this protocol is \(\left|\mathbb {F}\right|^{n-1}+1\), which is much larger than \(t_\varGamma = n\). In fact, the parameter \({\left|\mathbb {F}\right|^{n-1}+1}\) is typically not polynomially bounded, in which case knowledge soundness cannot be derived from this threshold special-soundness property.
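To illustrate the special-soundness property, the following sketch instantiates the protocol with the toy choice \(n=2\), \(\mathbb {F}= {\text {GF}}(7)\) and a hypothetical linear map \(\varPsi \), and recovers the witnesses from two accepting transcripts whose challenge vectors span \(\mathbb {F}^2\):

```python
p = 7                                    # toy field F = GF(7)
psi = lambda x: (x[0] + 2 * x[1]) % p    # a hypothetical F-linear map F^2 -> F

def amortized_response(xs, c):
    """Honest prover: z = sum_i c_i * x_i (componentwise, mod p)."""
    return tuple(sum(ci * xi[j] for ci, xi in zip(c, xs)) % p for j in range(2))

def extract(transcripts):
    """Recover x_1, x_2 from two accepting transcripts whose challenge
    vectors span F^2: solve the 2x2 linear system z_j = sum_i c_{j,i} x_i."""
    (c1, z1), (c2, z2) = transcripts
    det = (c1[0] * c2[1] - c1[1] * c2[0]) % p
    inv = pow(det, -1, p)                # det != 0 since the c_j span F^2
    rows = [((c2[1] * inv) % p, (-c1[1] * inv) % p),   # rows of the inverse
            ((-c2[0] * inv) % p, (c1[0] * inv) % p)]   # challenge matrix
    return [tuple((r[0] * z1[j] + r[1] * z2[j]) % p for j in range(2)) for r in rows]

xs = [(3, 5), (1, 6)]                    # witnesses: psi-preimages of P_1, P_2
Ps = [psi(x) for x in xs]
ts = [(c, amortized_response(xs, c)) for c in [(1, 2), (3, 1)]]
for c, z in ts:                          # both transcripts are accepting
    assert psi(z) == (c[0] * Ps[0] + c[1] * Ps[1]) % p
assert extract(ts) == xs                 # the witnesses are recovered exactly
```

Since \(\varPsi \) is linear and both transcripts verify, the recovered \(x_i\) are automatically \(\varPsi \)-preimages of the \(P_i\).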

Example 3

(Merkle Tree Commitments). Let us now consider an interactive proof for proving knowledge of the opening of a Merkle tree commitment P, i.e., P is the root of a Merkle tree and the prover claims to know all n leaves. To verify this claim, the verifier selects a subset S of k (distinct) indices between 1 and n uniformly at random. The prover sends the corresponding leaves together with their validation paths, which are checked by the verifier.

An opening of the commitment P can be extracted from accepting transcripts \((S_1,z_1),\dots ,(S_\ell ,z_\ell )\) if the subsets \(S_i\) cover \(\{1,\dots ,n\}\). Hence, this interactive proof is \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound, where

$$ \mathcal {C}= \{ S \subseteq \{1,\dots ,n\} : \left|S\right| =k \} \quad \text {and} \quad \varGamma = \bigl \{ \mathcal {D}\subseteq \mathcal {C}: \bigcup _{S\in \mathcal {D}} S = \{1,\dots ,n\} \bigr \}\,. $$

Further, \({t_\varGamma = n-k+1}\), \(U_\varGamma (\mathcal {D}) = \{ A \in \mathcal {C}: A \not \subseteq \bigcup _{S \in \mathcal {D}} S \}\) for all \(\mathcal {D}\notin \varGamma \), and \(\kappa _\varGamma = (n-k)/n\); thus, we obtain optimal knowledge soundness.

The threshold special-soundness parameter of this protocol is \({\left( {\begin{array}{c}n-1\\ k\end{array}}\right) +1}\) which is typically much larger than \(t_\varGamma = n-k+1\). Hence, also in this case our generalization provides a much more efficient knowledge extractor.

This simple interactive proof is an essential component in many more complicated protocols based on probabilistically checkable proofs (PCPs), interactive oracle proofs (IOPs) or MPC-in-the-head.
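The stated parameters can be reproduced by brute force for small n and k, under the interpretation (consistent with Lemma 4) that \(t_\varGamma \) is the length of the longest possible sequence of useful challenges, each of which must contribute at least one new leaf index. This is a consistency check on toy values, not the authors' computation:

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations

n, k = 5, 2
leaves = frozenset(range(n))
C = [frozenset(A) for A in combinations(leaves, k)]   # all possible challenges

# kappa: the largest fraction of challenges avoiding some fixed leaf index
kappa = max(Fraction(sum(i not in A for A in C), len(C)) for i in range(n))
assert kappa == Fraction(n - k, n)

@lru_cache(maxsize=None)
def t(covered):
    """Longest sequence of useful challenges (each adding at least one
    uncovered leaf index) before the whole leaf set is covered."""
    if covered == leaves:
        return 0
    return 1 + max(t(covered | A) for A in C if not A <= covered)

assert t(frozenset()) == n - k + 1
```

The first challenge always covers k fresh indices; afterwards, a worst-case sequence adds a single new index per challenge, giving the chain length \(1 + (n-k) = n-k+1\).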

Example 4

(Parallel Repetition). Finally, we consider an example where our generic technique does not work. To this end, let \(\varPi ^t\) be the t-fold parallel composition of a k-out-of-N special-sound interactive proof \(\varPi \) with challenge set \(\mathcal {C}\), i.e., \(\varPi ^t\) has challenge set \(\mathcal {C}^t\). Then, as discussed in the introduction, \(\varPi ^t\) is \(\bigl ((k-1)^t+1\bigr )\)-out-of-\(N^t\) special-sound, i.e., its threshold special-soundness parameter \((k-1)^t+1\) grows exponentially in t (if \(k>2\)).

The parallel repetition \(\varPi ^t\) is also \(\varGamma \)-out-of-\(\mathcal {C}^t\) special-sound, where \(\varGamma \) contains all subsets of challenge vectors \(\textbf{c}\in \mathcal {C}^t\) such that there is one position \(1\le i \le t\) where the challenge vectors feature at least k different values. Then, \(\kappa _\varGamma = (k-1)^t/N^t\). However, \(t_\varGamma = (k-1)^t+1\), i.e., \(t_\varGamma \) equals the threshold special-soundness parameter and grows exponentially in t. Hence, in this particular example, the correct access structure does not yield an efficient extractor. Fortunately, here we can apply the parallel repetition result of [3].

6 Knowledge Extraction for Multi-round Interactive Proofs

Let us now move to the analysis of multi-round interactive proofs \((\mathcal {P},\mathcal {V})\). To this end, we first generalize the notion of \(\varGamma \)-out-of-\(\mathcal {C}\) special-soundness to multi-round interactive proofs. A \(2\mu +1\)-round interactive proof is said to be \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-sound if there exists an efficient algorithm that can extract a witness from appropriate trees of transcripts. Before we formally define trees of transcripts, we first define the related trees of challenges.

Definition 9

(Tree of Challenges). Let \((\varGamma _i,\mathcal {C}_i)\) be monotone structures for \(1 \le i \le \mu \). A set containing a single challenge vector \((c_1,\dots ,c_\mu ) \in \mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \) is also referred to as a \((1,\dots ,1)\)-tree of challenges. Further, for \(1\le t \le \mu \), a \((1,\dots ,1,\varGamma _t,\dots ,\varGamma _\mu )\)-tree \(T_t\) of challenges is the union of several \((1,\dots ,1,\varGamma _{t+1},\dots ,\varGamma _\mu )\)-trees, such that

  • The first \(t-1\) coordinates of all \(\textbf{c}\in T_t \subseteq \mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \) are equal;

  • The t-th coordinates of the tree elements form an element in \(\varGamma _t\), i.e.,

    $$ \{ c \in \mathcal {C}_t : \exists (c_1,\dots ,c_{t-1},c,c_{t+1},\dots ,c_\mu )\in T_t \} \in \varGamma _t\,. $$

Trivially, the verifier’s messages in a transcript of a \(2\mu +1\)-round interactive proof with challenge sets \(\mathcal {C}_1,\dots ,\mathcal {C}_\mu \) form a \((1,\dots ,1)\)-tree of challenges. Hence, by adding the prover’s messages we obtain a \((1,\dots ,1)\)-tree of transcripts, and thus, in the obvious way, we obtain the notion of a tree of transcripts. The only additional requirement is that the prover’s messages collide, i.e., they are uniquely determined by the challenges received before sending the message. In particular, the first message of every transcript is the same. Note that if the transcripts are generated by a deterministic prover, this property is guaranteed to hold.
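Definition 9 translates into a short recursive membership test. The sketch below uses a hypothetical interface in which each monotone structure \(\varGamma _i\) is modelled as a predicate on sets of i-th coordinates, and checks whether a set of challenge vectors forms a \((\varGamma _1,\dots ,\varGamma _\mu )\)-tree:

```python
def is_tree(vectors, gammas, level=0):
    """Check Definition 9: does `vectors` (a set of challenge vectors, given
    as tuples) form a (Gamma_1,...,Gamma_mu)-tree? gammas[i] is a predicate
    deciding membership of a set of i-th coordinates in Gamma_{i+1}."""
    if level == len(gammas):
        return len(vectors) == 1                # a (1,...,1)-tree: one vector
    branches = {}
    for v in vectors:                           # group by the current coordinate
        branches.setdefault(v[level], set()).add(tuple(v))
    if not gammas[level](set(branches)):        # coordinates must lie in Gamma
        return False
    return all(is_tree(b, gammas, level + 1) for b in branches.values())

# threshold structures: a (2,2)-tree over two challenge rounds
gammas = [lambda s: len(s) >= 2, lambda s: len(s) >= 2]
assert is_tree({(0, 0), (0, 1), (1, 0), (1, 1)}, gammas)
assert not is_tree({(0, 0), (0, 1), (1, 0)}, gammas)
```

Grouping by the current coordinate enforces the prefix-equality condition of the definition, and the recursion descends one challenge round per level.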

Definition 10

(Tree of Transcripts). Let \((\varGamma _i,\mathcal {C}_i)\) be monotone structures for \(1 \le i \le \mu \). Let \((\mathcal {P},\mathcal {V})\) be a \(2\mu +1\)-round public-coin interactive proof with challenge sets \(\mathcal {C}_1,\dots ,\mathcal {C}_\mu \). A \((\varGamma _1,\dots ,\varGamma _\mu )\)-tree of transcripts is a set of protocol transcripts, such that

  • The corresponding set of challenge vectors, obtained by ignoring the prover’s messages, is a \((\varGamma _1,\dots ,\varGamma _\mu )\)-tree of challenges;

  • The prover’s messages collide, i.e., if two transcripts \((a_0,c_1,a_1,\dots ,c_\mu ,a_\mu )\) and \((a'_0,c'_1,a_1',\dots ,c'_\mu ,a_\mu ')\) are both in the tree, and \(c_i = c'_i\) for all \(i \le j\), then also \(a_i = a'_i\) for all \(i \le j\).

Prior works (e.g., [2, 7, 8]) considered \((k_1,\dots ,k_\mu )\)-trees, where \(k_i \in \mathbb {N}\) for all i. These are special cases of the above defined trees. More precisely, if \(\varGamma _i=\{ S \subseteq \mathcal {C}_i : \left|S\right| \ge k_i \}\) for all i, a \((k_1,\dots ,k_\mu )\)-tree is the same as a \((\varGamma _1,\dots ,\varGamma _\mu )\)-tree.

We are now ready to define a generalized multi-round special-soundness notion.

Definition 11

(\((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) Special-Soundness). Let \((\varGamma _i,\mathcal {C}_i)\) be monotone structures for \(1 \le i \le \mu \). A \(2\mu +1\)-round public-coin interactive proof \((\mathcal {P},\mathcal {V})\) for relation R, with challenge sets \(\mathcal {C}_1,\dots ,\mathcal {C}_\mu \), is \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-sound if there exists a polynomial time algorithm that, on input a statement x and a \((\varGamma _1,\dots ,\varGamma _\mu )\)-tree of accepting transcripts, outputs a witness \(w \in R(x)\). We also say that \((\mathcal {P},\mathcal {V})\) is \((\varGamma _1,\dots ,\varGamma _\mu )\)-special-sound.

Remark 3

The monotone access structure \(\bigl (\varGamma _\textsc {Tree}(\mathbf {\varGamma }), \mathcal {C}_1\times \cdots \times \mathcal {C}_\mu \bigr )\), where \(\mathbf {\varGamma }= (\varGamma _1,\dots ,\varGamma _\mu )\) and

$$ \varGamma _\textsc {Tree}(\varGamma _1,\dots ,\varGamma _\mu ) := \{ S \subseteq \mathcal {C}_1\times \cdots \times \mathcal {C}_\mu : S \text { contains a } (\varGamma _1,\dots ,\varGamma _\mu )\text {-tree} \}\, , $$

allows one to cast a multi-round \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-sound interactive proof as a \(\varGamma \)-out-of-\(\mathcal {C}\) special-sound interactive proof. Therefore, in principle, one could immediately apply the results from Sect. 4. However, typically, this results in an inefficient knowledge extractor. More precisely, the value \(t_{\varGamma _{\textsc {Tree}}(\mathbf {\varGamma })}\), and thus the expected running time of the extractor, grows linearly in the product of the sizes of the challenge sets \(\mathcal {C}_1,\dots ,\mathcal {C}_{\mu -1}\). For this reason, our multi-round knowledge extractor will proceed recursively over the different rounds.

Our goal is now to prove that, for appropriate monotone structures, \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-soundness (tightly) implies knowledge soundness. As before, again borrowing the notation from [3], we present our results in a more abstract language. To this end, let \(\mathcal {A}:\mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \rightarrow \{0,1\}^*\) be a probabilistic algorithm and

$$ V :\mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \times \{0,1\}^* \rightarrow \{0,1\} $$

a verification function. The success probability of \(\mathcal {A}\) is denoted as

$$ \epsilon (\mathcal {A}) := \Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) = 1 \bigr ) \,, $$

where C is distributed uniformly at random over \(\mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \). The obvious instantiation of \(\mathcal {A}\) is again a deterministic prover \(\mathcal {P}^*\) attacking a \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-sound interactive proof.

It turns out that defining the multi-round version of \(\delta _\varGamma \) is somewhat subtle. In the case of a \(\textbf{k}\)-special-sound protocol, it is defined in [3] as

$$ \delta ^V_{\textbf{k}}(\mathcal {A}) := \min _{S_1 \cdots S_\mu } \Pr \bigl ( V( C, \mathcal {A}( C) ) = 1 \, \big | \, C_1 \notin S_1,\, C_2 \notin S_2(C_1),\, C_3 \notin S_3(C_1,C_2),\, \ldots \bigr ) $$

where the minimum is over all sets \(S_1 \subseteq \mathcal {C}_1\) with \(|S_1| = k_1 - 1\), all functions \(S_2: \mathcal {C}_1 \rightarrow 2^{\mathcal {C}_2}\) with \(|S_2(c_1)| = k_2 - 1\) for all \(c_1 \in \mathcal {C}_1\), etc. Thus, the natural extension to \((\varGamma _1,\dots ,\varGamma _\mu )\)-special-sound protocols would be to use the very same expression but minimize over all (maximal) sets \(S_1 \subseteq \mathcal {C}_1\) with \(S_1 \notin \varGamma _1\), all functions \(S_2: \mathcal {C}_1 \rightarrow 2^{\mathcal {C}_2}\) with \(S_2(c_1)\) (maximal and) not in \(\varGamma _2\) for all \(c_1 \in \mathcal {C}_1\), etc.

However, writing \(\mathbf {\varGamma }= (\varGamma _1,\varGamma _2,\dots ,\varGamma _\mu )\), it turns out that defining \(\delta ^V_{\mathbf {\varGamma }}\) in this way will not lead to the desired results. In essence, the problem lies in the fact that the condition \(C_2 \not \in S_2(C_1)\) may bias the distribution of \(C_1\), namely when \(S_2(c_1)\) has different cardinality for different choices of \(c_1\). This issue is avoided in the threshold case by requiring the \(S_i\)’s to be maximal sets; here in the general case, this does not work, since different maximal sets may have different cardinality.

Because of this reason, we define \(\delta ^V_{\mathbf {\varGamma }}\) by the following, harder to comprehend, expression:

$$\begin{aligned} \begin{aligned} \delta ^V_{\mathbf {\varGamma }}(\mathcal {A}) & := \min _{S_1 \cdots S_\mu } \sum _\textbf{c}\Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) =1 \wedge C = \textbf{c}\mid C_1 \notin S_1, \\ & \qquad \quad C_2 \notin S_2(c_1), \dots , C_\mu \notin S_{\mu }(c_1,\dots ,c_{\mu -1}) \bigr )\,, \end{aligned} \end{aligned}$$
(5)

where, as in the above approach, the minimum is over all sets \(S_1 \subseteq \mathcal {C}_1\) with \(S_1 \notin \varGamma _1\), all functions \(S_2: \mathcal {C}_1 \rightarrow 2^{\mathcal {C}_2}\) with \(S_2(c_1) \notin \varGamma _2\) for all \(c_1 \in \mathcal {C}_1\), etc.

Remark 4

Note that, in the special case of 3-round interactive proofs, i.e., if \(\mu =1\), it holds that

$$ \begin{aligned} \delta ^V_{\varGamma }(\mathcal {A}) & = \min _{S \notin \varGamma } \sum _c \Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) =1 \wedge C = c \mid C \notin S \bigr ) \\ & = \min _{S \notin \varGamma } \Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) =1 \mid C \notin S \bigr ) \,. \end{aligned} $$

Hence, the multi-round version of \(\delta \) defined in Eq. 5, is indeed a generalization of the 3-round version defined in Eq. 1.

Remark 5

Let us consider the multi-round threshold case, i.e., let \(\mathcal {T}_\textbf{k}= (\mathcal {T}_{k_1},\dots ,\mathcal {T}_{k_\mu })\) with \(\mathcal {T}_{k_i}\) the monotone structure containing all subsets of \(\mathcal {C}_i\) with cardinality at least \(k_i\) for all i. Then, although not immediately obvious, it turns out that \( \delta _{\mathcal {T}_\textbf{k}}(\mathcal {A}) = \delta _\textbf{k}(\mathcal {A}) \) for all \(\mathcal {A}\).

Observing that, for the non-vanishing terms in the sum and exploiting the independence of \(V\bigl (\textbf{c}, \mathcal {A}(\textbf{c}) \bigr )\) and C for any fixed \(\textbf{c}\),

$$ \begin{aligned} &\Pr \bigl ( V\bigl (C, \mathcal {A}(C) \bigr ) =1 \mid C = \textbf{c}, C_1 \notin S_1, C_2 \notin S_2(c_1), \cdots \bigr ) \\ &\qquad = \Pr \bigl ( V\bigl (\textbf{c}, \mathcal {A}(\textbf{c}) \bigr ) =1 \mid C = \textbf{c}, C_1 \notin S_1, C_2 \notin S_2(c_1), \cdots \bigr ) \\ &\qquad = \Pr \bigl ( V\bigl (\textbf{c}, \mathcal {A}(\textbf{c}) \bigr ) =1 \bigr ) \, , \end{aligned} $$

we can re-write the definition as

$$\begin{aligned} \delta ^V_{\mathbf {\varGamma }}(\mathcal {A}) & = \min _{S_1 \cdots S_\mu } \sum _\textbf{c}\Pr \bigl ( V\bigl (\textbf{c}, \mathcal {A}(\textbf{c}) \bigr ) =1 \bigr ) \Pr \bigl (C=\textbf{c}\mid C_1 \notin S_1, C_2 \notin S_2(c_1), \cdots \bigr ) \\ & = \min _{S_1 \cdots S_\mu } \sum _\textbf{c}\Pr \bigl ( V\bigl (\textbf{c}, \mathcal {A}(\textbf{c}) \bigr ) =1 \bigr ) \Pr \bigl (C_1=c_1 \mid C_1 \notin S_1\bigr ) \cdots \\ & \qquad \qquad \qquad \cdots \Pr \bigl (C_\mu =c_\mu \mid C_{\mu } \notin S_\mu (c_1,\dots ,c_{\mu -1})\bigr )\,. \end{aligned}$$

This shows that the definition captures the success probability of \(\mathcal {A}\) when the challenges are sampled as follows (for given sets/functions \(S_1,S_2,\ldots \), over which the minimum is then taken): \(c_1\) is sampled uniformly at random subject to being outside of \(S_1\). Then, \(c_2\) is sampled uniformly at random subject to being outside of \(S_2(c_1)\). And so forth. We emphasize that, in general, this is not the same as sampling \(c_1,\ldots ,c_\mu \) uniformly at random subject to \(c_1 \notin S_1\), \(c_2 \notin S_2(c_1)\), etc., which biases the choice of \(c_1\) towards those for which \(S_2(c_1)\) is small(er), while with the above sampling there is no bias on \(c_1\) (beyond the exclusion from \(S_1\)). Defining \(\delta _{\mathbf {\varGamma }}^V\) in this way is crucial to our work. Oftentimes, the verification function V is clear from context, in which case we simply write \(\delta _{\mathbf {\varGamma }}(\mathcal {A})\) instead of \(\delta ^V_{\mathbf {\varGamma }}(\mathcal {A})\).
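The bias phenomenon described above can be made concrete by enumeration. With \(\mu =2\) and \(S_2(c_1)\) of varying cardinality (toy values; names illustrative), sequential sampling leaves \(c_1\) uniform, while jointly conditioning on \(c_1 \notin S_1 \wedge c_2 \notin S_2(c_1)\) skews \(c_1\) towards challenges with a small \(S_2(c_1)\):

```python
from collections import defaultdict
from fractions import Fraction

C1, C2 = [0, 1, 2], [0, 1, 2]
S1 = set()
S2 = {0: {0, 1}, 1: set(), 2: set()}   # |S2(c1)| varies with c1: the source of bias

# sequential sampling, as in Eq. 5: first c1, then c2 outside S2(c1)
seq = defaultdict(Fraction)
for c1 in C1:
    if c1 in S1:
        continue
    allowed = [c2 for c2 in C2 if c2 not in S2[c1]]
    for c2 in allowed:
        seq[(c1, c2)] += Fraction(1, len(C1) - len(S1)) * Fraction(1, len(allowed))

# jointly conditioned sampling: uniform over all pairs with c2 outside S2(c1)
pairs = [(c1, c2) for c1 in C1 if c1 not in S1 for c2 in C2 if c2 not in S2[c1]]
joint = {pc: Fraction(1, len(pairs)) for pc in pairs}

marg_seq = {a: sum(p for (c1, _), p in seq.items() if c1 == a) for a in C1}
marg_joint = {a: sum(p for (c1, _), p in joint.items() if c1 == a) for a in C1}
assert all(marg_seq[a] == Fraction(1, 3) for a in C1)   # no bias on c1
assert marg_joint[0] == Fraction(1, 7)                  # biased away from c1 = 0
```

Here \(c_1 = 0\) admits only one allowed \(c_2\) out of seven allowed pairs in total, so joint conditioning assigns it mass 1/7 instead of 1/3.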

Any choice of sets/functions \(S_1,\dots ,S_\mu \) considered in the minimization in Eq. 5 defines a subset

$$ X = \{(c_1,\dots ,c_\mu ) \in \mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \mid c_1 \in S_1 \vee \cdots \vee c_\mu \in S_\mu (c_1,\dots ,c_{\mu -1} ) \} $$

that does not contain a \(\mathbf {\varGamma }\)-tree. Hence, again the success probability is punctured by removing some set X from which we cannot extract, and on which a dishonest prover may thus (potentially) be successful. Moreover, every subset of \(\mathcal {C}_1\times \dots \times \mathcal {C}_\mu \) that does not contain a \(\mathbf {\varGamma }\)-tree is contained in a set X of this form. Hence, if a prover has positive success probability outside all such subsets X, i.e., if \(\delta _{\mathbf {\varGamma }}^V(\mathcal {A})>0\), then extraction of a \(\mathbf {\varGamma }\)-tree of accepting transcripts is in principle possible. However, it is far less obvious that extraction can also be done efficiently. The following lemma shows that, for appropriate monotone structures \((\varGamma _i,\mathcal {C}_i)\), an efficient extraction algorithm indeed exists. This is a generalization of [3, Lemma 4]. Using the notation we introduced here, their proof almost immediately carries over to this more generic setting. For completeness, we present the proof below.

Lemma 6

(Multi-round Extraction Algorithm). Let \({\mathbf {\varGamma }= (\varGamma _1,\dots ,\varGamma _\mu )}\) and \(\boldsymbol{\mathcal {C}}= \mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \) be such that \((\varGamma _i,\mathcal {C}_i)\) are nonempty monotone structures for all i. Further, let \(T := \prod _{i=1}^\mu t_{\varGamma _i}\) and \(V :\boldsymbol{\mathcal {C}}\times \{0,1\}^* \rightarrow \{0,1\}\). Then, there exists an algorithm \(\mathcal {E}^{\mathcal {A}}\) so that, given oracle access to any (probabilistic) algorithm \(\mathcal {A}:\boldsymbol{\mathcal {C}}\rightarrow \{0,1\}^*\), \(\mathcal {E}^{\mathcal {A}}\) requires an expected number of at most \( 2^{\mu }\cdot T \) queries to \(\mathcal {A}\) and, with probability at least \(\delta _{\mathbf {\varGamma }}(\mathcal {A})/T\), outputs pairs \((\textbf{c}_i,y_i) \in \boldsymbol{\mathcal {C}}\times \{0,1\}^*\) such that \(\{\textbf{c}_i\}_i\) is a \(\mathbf {\varGamma }\)-tree with \(V(\textbf{c}_i,y_i)= 1\) for all i.

For the proof of Lemma 6, we refer to the full version [1].

Let us now derive a lower bound on the value \(\delta _{\mathbf {\varGamma }}^V(\mathcal {A})\). To this end, for \(\textbf{c}= (c_1,\ldots ,c_\mu ) \in \mathcal {C}_1 \times \cdots \times \mathcal {C}_\mu \), we write \(V(\textbf{c})\) as a shorthand for \(V(\textbf{c}, \mathcal {A}(\textbf{c}))\). Furthermore, for any fixed choices of \(S_1, S_2, \ldots ,S_\mu \), as in the definition of \(\delta _{\mathbf {\varGamma }}(\mathcal {A})\) (Eq. 5), we introduce the event

$$\begin{aligned} \varOmega (\textbf{c}) &:= \big [ C_1 \notin S_1 \wedge C_2 \notin S_2(c_1) \wedge \dots \wedge C_{\mu } \notin S_{\mu }(c_1,\ldots ,c_{\mu -1}) \big ]\,. \end{aligned}$$

Then,

$$\begin{aligned} \sum _\textbf{c}\Pr \bigl ( V(\textbf{c}) = 1 \wedge C = \textbf{c}\mid \varOmega (\textbf{c}) \bigr ) &\ge \sum _\textbf{c}\Pr \bigl ( V(\textbf{c}) = 1 \wedge \varOmega (\textbf{c}) \wedge C = \textbf{c}\bigr ) \\ &= \sum _\textbf{c}\Pr \bigl ( V(C) = 1 \wedge \varOmega (C) \wedge C = \textbf{c}\bigr ) \\ &= \Pr \bigl ( V(C) = 1 \wedge \varOmega (C)\bigr ) \\ &\ge \Pr \bigl ( V(C) = 1\bigr ) - \Pr \bigl ( \lnot \varOmega (C)\bigr ) \, . \end{aligned}$$

Now note that

$$\begin{aligned} \begin{aligned} \Pr \bigl (\lnot \varOmega (C)\bigr ) & = 1- \Pr \bigl (\varOmega (C) \bigr ) \\ & = 1 - \Pr \bigl ( C_1 \notin S_1 \bigr ) \Pr \bigl ( C_2 \notin S_2(C_1) \mid C_1 \notin S_1 \bigr ) \cdots \\ &\le 1 - \bigg (1 - \max _{S_1 \notin \varGamma _1} \frac{|S_1|}{|\mathcal {C}_1|}\bigg ) \bigg (1 - \max _{S_2 \notin \varGamma _2} \frac{|S_2|}{|\mathcal {C}_2|}\bigg ) \cdots \\ &= \kappa _\mathbf {\varGamma }\, , \end{aligned} \end{aligned}$$
(6)

where

$$ \begin{aligned} \kappa _{\mathbf {\varGamma }} := \max _{S \notin \varGamma _{\textsc {Tree}}(\mathbf {\varGamma })} \frac{\left|S\right|}{\left|\mathcal {C}\right|} = 1 - \prod _{i=1}^\mu \biggl (1 - \max _{S_i \notin \varGamma _i} \frac{\left|S_i\right|}{\left|\mathcal {C}_i\right|} \biggr ) = 1 - \prod _{i=1}^\mu (1 - \kappa _{\varGamma _i} ) \,. \end{aligned} $$

We thus obtain that

$$\begin{aligned} \delta _{\mathbf {\varGamma }}(\mathcal {A}) \ge \epsilon (\mathcal {A}) - \kappa _\mathbf {\varGamma }\,. \end{aligned}$$
(7)

These observations complete the proof of the following theorem.

Theorem 2

Let \((\mathcal {P},\mathcal {V})\) be a \((\varGamma _1,\dots ,\varGamma _\mu )\)-out-of-\((\mathcal {C}_1,\dots ,\mathcal {C}_\mu )\) special-sound interactive proof such that \(T_\mathbf {\varGamma }= \prod _{i=1}^\mu t_{\varGamma _i}\) is polynomial in the size \(\left|x\right|\) of the public input statement x of \((\mathcal {P},\mathcal {V})\) and sampling from \(U_{\varGamma _i}(S_i)\) takes polynomial time (in \(\left|x\right|\)) for all \(1\le i\le \mu \) and \(S_i\subset \mathcal {C}_i\) with \(\left|S_i\right| < t_{\varGamma _i}\). Then \((\mathcal {P},\mathcal {V})\) is knowledge sound with knowledge error

$$ \kappa _{\mathbf {\varGamma }} = 1 - \prod _{i=1}^\mu \biggl (1 - \max _{S_i \notin \varGamma _i} \frac{\left|S_i\right|}{\left|\mathcal {C}_i\right|} \biggr ) \,. $$

7 Analysis of the FRI-Protocol

In this section we show how to use our generalized notion of special-soundness to demonstrate the existence of a quasi-polynomial time knowledge extractor with essentially optimal success probability for the Fast Reed-Solomon Interactive Oracle Proof of Proximity due to Ben-Sasson et al. [5], assuming it has been compiled into an interactive proof in the natural way (i.e., the oracles are replaced by compact commitments to the vectors with a local opening functionality). We first provide the necessary background on the protocol before providing our analysis. We remark that we use ideas that were implicit in prior works; our main aim in this section is to demonstrate the utility of our generalized special-soundness notion and the accompanying knowledge extractor.

7.1 Preliminaries on Reed-Solomon Codes

Let \(\mathbb {F}\) be a finite field of cardinality q and \(S \subseteq \mathbb {F}\). Given a polynomial \(f(X) \in \mathbb {F}[X]\) we let \(f(S) = (f(s))_{s \in S}\) denote the vector of evaluations of f over the domain S (given in some arbitrary, but fixed, order). For an integer \(\ell \) we write \(S^{\cdot \ell }\) for the set of \(\ell \)-powers of elements in S, i.e. \(\{s^\ell : s \in S\}\).Footnote 6

For any \(0\le \rho \le 1\), the Reed-Solomon code \({\text {RS}}[\mathbb {F},S,\rho ] \subseteq \mathbb {F}^{|S|}\) consists of all evaluations over the domain S of polynomials \(F(X) \in \mathbb {F}[X]\) of degree less than \(\rho |S|\). In notation,

$$ {\text {RS}}[\mathbb {F},S,\rho ] := \{F(S) :F(X) \in \mathbb {F}[X] \wedge \deg (F) < \rho |S|\} \ . $$

In the sequel we will assume S is a multiplicative subgroup of \(\mathbb {F}^*\) of order a power of 2, with the understanding that our analysis should generalize readily to other “smooth” evaluation domains for FRI protocols. We further set \(\rho = 2^{-r}\) for an integer \(r < \log _2(|S|)\), which implies \(\rho |S| \in \mathbb {N}\) and that the dimension of \({\text {RS}}[\mathbb {F},S,\rho ]\) is precisely \(\rho |S|\).

Letting \(N=|S|\), we therefore have \(S = \langle \omega \rangle = \{1,\omega ,\omega ^2,\dots ,\omega ^{N-1}\}\), where \(\omega \) is a primitive N-th root of unity. Note then that \(S^{\cdot 2} = \langle \omega ^2 \rangle = \{1,\omega ^2,\omega ^4,\dots ,\omega ^{N-2}\}\) is a multiplicative subgroup of \(\mathbb {F}^*\) of order N/2. More generally, for any \(j=1,2,\dots ,\log _2(N)\), \(S^{\cdot 2^j} = \langle \omega ^{2^j}\rangle \) is a multiplicative subgroup of \(\mathbb {F}^*\) of order \(N/2^j\).

Given two polynomials \(f(X), g(X) \in \mathbb {F}[X]\) we let \(d_S(f,g) := |\{s \in S:f(s) \ne g(s)\}|\) denote the number of points \(s \in S\) on which f and g differ. Equivalently, it denotes the (unnormalized) Hamming distance between the vectors f(S) and g(S).

Given a polynomial \(f\in \mathbb {F}[X]\), we let

$$ \delta _S(f) := \frac{\min _F\{d_S(f,F):F \in \mathbb {F}[X],\, \deg (F)<\rho |S|\}}{|S|} \ . $$

In other words, \(\delta _S(f)\) denotes the relative Hamming distance of f(S) to a closest codeword in \({\text {RS}}[\mathbb {F},S,\rho ]\).
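
For concreteness, \(\delta _S(f)\) can be computed by brute force for a toy code. The parameters below are our own illustrative choices (\(\mathbb {F}=\mathbb {F}_{17}\), S the order-8 subgroup of \(\mathbb {F}_{17}^*\), \(\rho = 1/4\)), not ones from the text:

```python
from itertools import product

P = 17
S = [1, 2, 4, 8, 9, 13, 15, 16]  # the subgroup <9> of F_17^*, order 8, closed under negation
RHO = 0.25                       # codewords: evaluations of polynomials of degree < RHO*|S| = 2

def poly_eval(coeffs, x, p=P):
    """Horner evaluation of coeffs[0] + coeffs[1]*x + ... modulo p."""
    y = 0
    for a in reversed(coeffs):
        y = (y * x + a) % p
    return y

def delta_S(f):
    """Relative Hamming distance of f(S) to a closest codeword of RS[F_17, S, 1/4],
    by exhaustively enumerating all low-degree polynomials."""
    k = int(RHO * len(S))        # dimension of the code
    f_S = [poly_eval(f, s) for s in S]
    best = len(S)
    for F in product(range(P), repeat=k):
        d = sum(1 for s, y in zip(S, f_S) if poly_eval(list(F), s) != y)
        best = min(best, d)
    return best / len(S)
```

For instance, `delta_S([3, 5])` is 0, since a degree-1 polynomial is already a codeword, while a high-degree polynomial such as \(X^7\) evaluates far from the code.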

7.2 FRI-Protocol

Let \(\mathcal {O}^f\) be an oracle implementing some function \(f :S \rightarrow \mathbb {F}\), which of course uniquely corresponds to a polynomial of degree less than \(N=|S|\). We are interested in the situation where a prover claims that f(X) is in fact a polynomial of degree \(<\rho N\), i.e., that \(f(S) \in {\text {RS}}[\mathbb {F},S,\rho ]\). In order to verify this, the verifier may make queries to \(\mathcal {O}^f\), but it is easy to see that in order to catch a lying prover the verifier must query each \(s \in S\) (or at least \(\varOmega (|S|)\) such points in order to catch the prover with good probability).

Thus, for soundness, we will be satisfied with rejecting oracles implementing functions that are far from low degree, i.e., such that \(\delta _S(f)\ge \delta \). However, here as well we cannot hope to catch cheating provers without making at least \(\rho N+1\) queries (as \(\rho N\) evaluations are always consistent with some polynomial of degree \(<\rho N\)). It turns out to be possible to make significantly fewer (i.e., just logarithmically many) oracle queries if we allow the verifier to interact with the prover.

The resulting protocols are referred to as interactive oracle proofs of proximity (IOPPs). In order to demonstrate the utility of our general special soundness notion, we will show how to analyze the Fast Reed-Solomon Interactive Oracle Proof of Proximity (FRI-protocol) [5].

In order to implement the oracle \(\mathcal {O}^f\) cryptographically, one makes use of a compact commitment scheme, typically via a Merkle tree [6]. In the following we denote the commitment to the vector \(F(S) = (F(s))_{s \in S}\) with public parameters \(\textsf {pp}\) by \(P\leftarrow \textsc {Com}_\textsf {pp}(F(S))\) and the local opening information for \(s \in S\) as \(\gamma _s\). For example, in the case of a Merkle tree the public parameters \(\textsf {pp}\) would be a description of the hash function used, while \(\gamma _s\) would give hash values for the co-path of the leaf corresponding to s. We also assume access to a procedure \(\textsc {Loc}_\textsf {pp}\) which takes as input a commitment P, a domain element s, a value \(y_s \in \mathbb {F}\) and the opening information \(\gamma _s\) and outputs 1 if and only if \(\gamma _s\) indeed certifies that P opens to \(y_s\) on the element s.

We can therefore view the (cryptographically compiled version of the) \(\textsf {FRI}\)-protocol as an interactive proof for the pair of relations \((\mathfrak {R}_0,\mathfrak {R}_\delta \cup \mathfrak {R}_{\textsf {coll}})\), where for a parameter \(\beta \in [0,1)\) we define

$$\begin{aligned} \mathfrak {R}_{\beta } &:= \bigl \{ (P,\textsf {pp}; F,B,(\gamma _s)_{s \in B}) : \deg (F) < \rho N~\wedge ~|B| \ge (1-\beta )N \\ &~~~~~\qquad \wedge ~\forall s \in B,~\textsc {Loc}_\textsf {pp}(P,s,F(s),\gamma _s)=1 \bigr \}\, , \end{aligned}$$

while

$$\begin{aligned} \mathfrak {R}_{\textsf {coll}} &:= \bigl \{(P,\textsf {pp}; s,y,y',\gamma ,\gamma ') : y \ne y' ~\wedge ~ \textsc {Loc}_\textsf {pp}(P,s,y,\gamma )=1 \\ &~~~~~\qquad \wedge ~ \textsc {Loc}_\textsf {pp}(P,s,y',\gamma ')=1\bigr \}\, . \end{aligned}$$

This means that completeness holds with respect to relation \(\mathfrak {R}_0\) and soundness holds with respect to \(\mathfrak {R}_\delta \cup \mathfrak {R}_\textsf {coll}\), where the latter refers to the “or-relation” which accepts a witness for one or the other instance. On the one hand, this says that a prover that committed to a low-degree polynomial will indeed convince the verifier of this fact. On the other hand, if a prover has a good probability of convincing the verifier then we can either extract openings of the commitment on many coordinates that agree with a low-degree polynomial, or we can extract two distinct local openings from the same commitment (invalidating the binding property of the commitment).Footnote 7

Folding. An important ingredient in the FRI-protocol is a folding operation. For our specific choice of S, it is defined as follows: for \(f(X)\in \mathbb {F}[X]\) and \(c \in \mathbb {F}\), we define

$$ {\text {Fold}}\bigl (f(X),c\bigr ) = g(X) \in \mathbb {F}[X] $$

such that

$$ g(X^2)=\frac{f(X)+f(-X)}{2} + c \frac{f(X)-f(-X)}{2X}\,. $$

Intuitively, this folding operation considers the even-power monomials of f(X) and the odd-power monomials separately, obtains from these terms two polynomials of degree \(\deg (f)/2\), and takes a random linear combination of these polynomials. Importantly, the polynomial g(X) can then naturally be viewed as having degree roughly \(\deg (f)/2\) (i.e., the degree is halved) and its domain is naturally viewed as \(S^{\cdot 2} = \langle \omega ^2 \rangle \), which has order N/2. That is, the folded polynomial has its degree and domain halved.
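
In coefficients, folding simply mixes the even and odd parts of f: writing \(f(X) = E(X^2) + X\,O(X^2)\), one has \({\text {Fold}}(f,c) = E + cO\). A sketch over a small prime field (the field \(\mathbb {F}_{97}\) and the function names are our own illustrative choices):

```python
P = 97  # small prime field for illustration

def poly_eval(coeffs, x, p=P):
    """Horner evaluation of coeffs[0] + coeffs[1]*x + ... modulo p."""
    y = 0
    for a in reversed(coeffs):
        y = (y * x + a) % p
    return y

def fold(coeffs, c, p=P):
    """Fold(f, c): the polynomial g with
    g(X^2) = (f(X)+f(-X))/2 + c*(f(X)-f(-X))/(2X),
    i.e. coefficientwise g_k = f_{2k} + c*f_{2k+1}."""
    even, odd = coeffs[0::2], coeffs[1::2]
    odd = odd + [0] * (len(even) - len(odd))
    return [(e + c * o) % p for e, o in zip(even, odd)]
```

One can check the defining identity \(g(x^2)=\frac{f(x)+f(-x)}{2}+c\frac{f(x)-f(-x)}{2x}\) at arbitrary nonzero points; note also that `fold` halves the number of coefficients, reflecting the halving of the degree.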

A one-round version of the FRI-protocol thus proceeds as follows. First, the prover commits to F(S), where it promises that \(F(S) \in {\text {RS}}[\mathbb {F},S,\rho ]\). The verifier picks a random challenge \(c \in \mathbb {F}\), sends it to the prover, and the prover responds with the folding G of F around c. The verifier first checks that \(\deg (G) < \rho N/2\). If yes, the verifier then chooses t points \(s_1,\dots ,s_t \in S\) (each uniformly at random and thus possibly colliding), and asks for the evaluations of F on all points \(\pm s_i\). It then checks that these evaluations are consistent with G, i.e., that \(G(s_i^2) = \frac{F(s_i)+F(-s_i)}{2} + c \frac{F(s_i)-F(-s_i)}{2s_i}\) for all \(1 \le i \le t\), and of course that these are indeed the values the prover committed to initially.

7.3 Analyzing the FRI-Protocol

In order to analyze the FRI-protocol, we must create an extractor that takes as input folding challenges and then openings for various points \(s \in S\) that are consistent with the folded polynomials (which are assumed to be low-degree). From two distinct folding challenges \(c,c' \in \mathbb {F}\), if G(X) and \(G'(X)\) are the foldings around c and \(c'\) respectively of the function the prover committed to, then we can create the following polynomial:

$$ F(X) = X\frac{G(X^2)-G'(X^2)}{c-c'} + \frac{cG'(X^2)-c'G(X^2)}{c-c'} \ . $$

Note that if G and \(G'\) have degree less than \(\rho N/2\), then indeed F would have degree less than \(\rho N\).
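
This reconstruction can be checked mechanically: with \(G = E + cO\) and \(G' = E + c'O\) in the even/odd decomposition of the committed polynomial, the displayed formula returns \(E(X^2) + X\,O(X^2) = F(X)\). A coefficient-level sketch over an illustrative toy field \(\mathbb {F}_{97}\) (the names `fold` and `unfold` are ours):

```python
P = 97

def fold(coeffs, c, p=P):
    """Fold(f, c) in coefficients: g_k = f_{2k} + c*f_{2k+1}."""
    even, odd = coeffs[0::2], coeffs[1::2]
    odd = odd + [0] * (len(even) - len(odd))
    return [(e + c * o) % p for e, o in zip(even, odd)]

def unfold(G, Gp, c, cp, p=P):
    """Invert folding: recover F from its foldings G (at c) and G' (at c').
    Coefficientwise: F_{2k} = (c*G'_k - c'*G_k)/(c-c'), F_{2k+1} = (G_k - G'_k)/(c-c')."""
    inv = pow((c - cp) % p, p - 2, p)  # modular inverse of c - c'
    F = []
    for gk, gpk in zip(G, Gp):
        F.append(((c * gpk - cp * gk) * inv) % p)  # even coefficient
        F.append(((gk - gpk) * inv) % p)           # odd coefficient
    return F

f = [7, 1, 0, 5]  # 7 + X + 5X^3 over F_97
assert unfold(fold(f, 13), fold(f, 42), 13, 42) == f
```

The assertion confirms that two foldings at distinct challenges determine the original polynomial exactly, which is what the extractor exploits.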

The extractor may also rewind the second phase of the protocol to obtain sets A and \(A'\) covering at least a \((1-\delta )\) fraction of S. We can then conclude that we have consistent openings on their intersection \(A \cap A'\) (assuming that we do not violate the binding property of the commitment, i.e., that we do not extract a witness for the relation \(\mathfrak {R}_{\textsf {coll}}\)). The intersection \(A \cap A'\) covers at least a \((1-2\delta )\) fraction of S, so we have found a low-degree polynomial agreeing with the commitment on a \((1-2\delta )\) fraction of the points of S.

At this point, we could iterate this argument. However, iterating this argument over \(\mu \) folding rounds would cause us to only prove that the prover committed to a function that agrees with a low-degree polynomial on a \((1-2^\mu \delta )\)-fraction of the coordinates (assuming that we did not extract a collision in the commitment). This is quite unsatisfactory, as we would like to have \(\mu \) logarithmic in N and \(\delta \in (0,1)\) a constant. Fortunately, by relying on ideas from prior works (specifically, [5]) we can show that we can indeed extract a low-degree polynomial agreeing with the commitment on a \((1-\delta )\) fraction of coordinates (or, of course, a violation to the binding property of the commitment).

In order to analyze the soundness of the FRI-protocol more effectively, we will need the following coset-distance from f to \({\text {RS}}[\mathbb {F},S,\rho ]\):

$$ \varDelta _S(f) := \min _{F \in \mathbb {F}[X],~\deg (F)<\rho N}\frac{\left| \{ s \in S : f(s) \ne F(s) \vee f(-s) \ne F(-s)\}\right|}{N} \ . $$

This distance notion has been used in prior works [5]. Observe that \(\varDelta _S(f) \ge \delta _S(f)\). Intuitively, this measure is useful because it allows for a more careful accounting of how the Hamming metric behaves under the folding operation than the above naïve analysis. For this reason, our extractor will succeed assuming a bound on \(\varDelta _S(f)\) rather than just \(\delta _S(f)\).

The following lemma quantifies this intuition, by characterizing the set of challenges c that could cause the Hamming metric to decrease when a function f is folded around c. These ideas are implicit in [5, Lemma 4.4]; we restate them in a language that is convenient for us. The full version of this work [1] includes a proof of the following lemma.

Lemma 7

Let \(f(X) \in \mathbb {F}[X]\) be such that \(\varDelta _S(f) < (1-\rho )/2\). The number of choices for \(c \in \mathbb {F}\) such that \(\delta _{S^{\cdot 2}}\big ( {\text {Fold}}(f,c)\big ) < \varDelta _S(f)\) is at most N.

In particular, if there exist pairwise distinct \(c_0,\dots ,c_N \in \mathbb {F}\) such that \(\delta _{S^{\cdot 2}}\big ( {\text {Fold}}(f,c_i)\big ) \le \delta \) for all \(i \in \{0,1,\dots ,N\}\), then \(\varDelta _S(f) \le \delta \).

We now precisely define the notion of special-soundness that we will prove the FRI-protocol with one folding iteration satisfies. Informally, for the folding round the previous lemma tells us we need \({N+1}\) challenges to extract, while for the second round we need enough local openings of the commitment to reveal a \((1-\delta )\)-fraction of the values that the prover committed to. We now make this formal.

Let

$$ \mathcal {C}:= S^t = \bigl \{ (s_1,s_2,\dots ,s_t) : s_i \in S~\forall i\bigr \} \ . $$

For a challenge \(c = (s_1,\dots ,s_t) \in \mathcal {C}\) we denote by

$$ B(c) = \{s_1,-s_1,s_2,-s_2,\dots ,s_t,-s_t\} $$

the setFootnote 8 of elements of S that appear in the challenge tuple c, along with their negations. That is, it is the set of points that will be queried by the verifier if it samples \((s_1,s_2,\dots ,s_t)\) in the final verification step. Let \((\varGamma _{N+1},\mathbb {F})\) be the monotone structure that contains all subsets of \(\mathbb {F}\) of cardinality at least \(N+1\), and let \((\varGamma ,\mathcal {C})\) be the monotone structure that contains all subsets of \(\mathcal {C}\) that cover at least a \((1-\delta )\)-fraction of S, i.e.,

$$ A \in \varGamma \subset 2^{\mathcal {C}} \quad \iff \quad \left|\bigcup _{c \in A}B(c)\right| \ge (1-\delta ) N\,. $$
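
Membership in \(\varGamma \) is a simple covering condition, which can be sketched as follows (toy parameters and the names `B`, `in_Gamma` are our own):

```python
P = 17
S = [1, 2, 4, 8, 9, 13, 15, 16]  # toy evaluation domain: subgroup of F_17^*, closed under negation

def B(c, p=P):
    """Points queried for challenge tuple c: its entries together with their negations."""
    return {x for s in c for x in (s % p, (-s) % p)}

def in_Gamma(A, delta, S=S, p=P):
    """A is in Gamma iff the challenges in A jointly cover >= (1-delta)*|S| points of S."""
    covered = set().union(*(B(c, p) for c in A)) if A else set()
    return len(covered) >= (1 - delta) * len(S)
```

For instance, with \(\delta = 1/4\) and \(t=2\), the two challenges (1, 2) and (4, 9) already cover all eight points of S, so \(\{(1,2),(4,9)\} \in \varGamma \), whereas a single challenge tuple can cover at most four points and hence never suffices.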

Theorem 3

(FRI-protocol (one folding iteration)). Let \(\rho = 2^{-r}\) for some \(r \in \{0,1,\dots ,m\}\) and let \(\delta \in (0,1)\) be such that \(\delta < \frac{1-\rho }{4}\). The FRI-Protocol is perfectly complete with respect to relation \(\mathfrak {R}_0\) and \((\varGamma _{N+1},\varGamma )\)-out-of-\((\mathbb {F}, \mathcal {C})\) special-sound with respect to relation \(\mathfrak {R}_\delta \cup \mathfrak {R}_{\textsf {coll}}\).

Proof

Completeness: This is immediate from prior work (e.g., [5]). To make our proof self-contained, we note that this follows immediately from the following facts concerning a polynomial \(F(X) \in \mathbb {F}[X]\):

  • if F has degree \(<\rho N\) then \({\text {Fold}}(F,c)\) has degree \(<\rho N/2\) for any \(c \in \mathbb {F}\); and

  • for any \(s \in S\) and \(c \in \mathbb {F}\), \({\text {Fold}}(F,c)(s^2) = \frac{F(s) + F(-s)}{2} + c \frac{F(s) -F(-s)}{2s}\).

Soundness: We must extract a witness for either the relation \(\mathfrak {R}_\delta \) or the relation \(\mathfrak {R}_{\textsf {coll}}\) given a \((\varGamma _{N+1},\varGamma )\)-tree of accepting transcripts. Such a tree of transcripts consists of the following:

  • folding challenges \(c_0,\dots ,c_N \in \mathbb {F}\),

  • polynomials \(G_0,\dots ,G_N \in \mathbb {F}[X]\) of degree less than \(\rho \frac{N}{2}\),

  • subsets \(A_0,\dots ,A_N \subseteq \mathcal {C}\), each satisfying \(\left|\bigcup _{c \in A_j}B(c)\right| \ge (1-\delta )N\), and

  • for each \(0 \le j \le N\), for each \(s \in \bigcup _{c \in A_j}B(c)\), opening information \(\gamma _{sj}\) for the element s. Let \(y_{sj} \in \mathbb {F}\) be the element for which \(\textsc {Loc}_\textsf {pp}(P,s,y_{sj},\gamma _{sj})=1\).

Let \(B_j := \bigcup _{c \in A_j}B(c)\) for \(0 \le j \le N\), and observe that these sets are closed under negation (i.e., \(s \in B_j \iff -s \in B_j\)).

Suppose there exists \(j \ne j'\) such that, for some \(s \in B_j \cap B_{j'}\), \(y_{sj}\ne y_{sj'}\). Then, we may output the following witness for the relation \(\mathfrak {R}_{\textsf {coll}}\): \((s,y_{sj},y_{sj'},\gamma _{sj},\gamma _{sj'})\).

We may now assume that the above does not occur. In other words, for each \(s \in \bar{B} := B_0 \cup \ldots \cup B_N\) the set \(\{y_{sj}: 0 \le j \le N,\, s \in B_j\}\) is in fact a singleton set; denote its unique element by \(y_s\). We also let \(\gamma _s := \gamma _{sj}\) where j is the smallest element in \(\{0,1,\dots ,N\}\) such that \(s \in B_j\) (this is just an arbitrary tie-breaking rule).

For each \(j \in \{0,1,\dots ,N\}\), the polynomial \(G_j\) and the elements \(y_s\) for \(s \in B_j\) satisfy the following relation:

$$ G_j(s^2) = \frac{y_s + y_{-s}}{2} + c_j \frac{y_s - y_{-s}}{2\,s} \ . $$

Let \(f(X) \in \mathbb {F}[X]\) be a polynomial consistent with the \(y_s\)’s, i.e., for all \(s \in \bar{B}\) we have \(f(s) = y_s\). Furthermore, for reasons that will become clear later, we let f differ from the polynomial \(F_0\) defined below outside of \(\bar{B}\), i.e., \(f(s) \ne F_0(s)\) for all \(s \not \in \bar{B}\). Then, for each \(j \in \{0,1,\dots ,N\}\) and all \(s^2\) such that \(\{\pm s\} \subseteq B_j\), we have

$$ G_j(s^2) = {\text {Fold}}\big (f,c_j\big )(s^2) \ . $$

We conclude that \({\text {Fold}}\big (f,c_j\big )\) and \(G_j\) agree on at least \((1-\delta )\frac{N}{2}\) elements of \(S^{\cdot 2}\). As \(\deg (G_j) < \rho \frac{N}{2}\) it follows that

$$ \delta _{S^{\cdot 2}}\bigl ({\text {Fold}}\big (f,c_j\big )\bigr ) \le \delta \, . $$

By Lemma 7, if we establish that \(\varDelta _S(f) < \frac{1-\rho }{2}\), then it in fact follows that \(\varDelta _S(f) \le \delta \), which in turn implies \(\delta _S(f) \le \delta \). As \(2\delta < \frac{1-\rho }{2}\) by assumption, it suffices for us to show \(\varDelta _S(f) \le 2\delta \). We focus on proving this now.

Consider the polynomial

$$ F_0(X) := X \frac{G_0(X^2) - G_{1}(X^2)}{c_0 - c_{1}} + \frac{c_0 G_{1}(X^2) - c_{1} G_0(X^2)}{c_0-c_{1}} \ . $$

Since the degrees of \(G_0\) and \(G_{1}\) are smaller than \(\rho \frac{N}{2}\), it follows that \(\deg (F_0)<\rho N\). Furthermore, we note that for all \(s \in B_0 \cap B_1\) we have \(f(s) = F_0(s)\). Indeed,

$$\begin{aligned} F_0(s) &= s \cdot \frac{G_0(s^2) - G_{1}(s^2)}{c_0 - c_{1}} + \frac{c_0 G_{1}(s^2) - c_{1} G_0(s^2)}{c_0-c_{1}} \\ &= \frac{s}{c_0-c_{1}}\left[ \frac{f(s)+f(-s)}{2} + c_0 \frac{f(s)-f(-s)}{2\,s} \right. \\ &\quad \quad - \left. \left( \frac{f(s)+f(-s)}{2} + c_{1} \frac{f(s)-f(-s)}{2\,s}\right) \right] \\ &\quad + \frac{1}{c_0-c_{1}}\left[ c_0 \cdot \left( \frac{f(s)+f(-s)}{2} + c_{1}\frac{f(s)-f(-s)}{2\,s}\right) \right. \\ &\quad \quad \left. - c_{1} \cdot \left( \frac{f(s)+f(-s)}{2} + c_0 \frac{f(s)-f(-s)}{2\,s}\right) \right] \\ &= \frac{s}{c_0-c_{1}} \cdot (c_0-c_{1})\frac{f(s)-f(-s)}{2\,s} + \frac{1}{c_0-c_{1}} \cdot (c_0-c_{1})\frac{f(s)+f(-s)}{2} \\ &= \frac{f(s)-f(-s)}{2} + \frac{f(s)+f(-s)}{2} = f(s) \ . \end{aligned}$$

From this, we can conclude that f and \(F_0\) agree on at least \((1-2\delta )N/2\) pairs \(\{\pm s\}\): here, we use the fact that as \(B_0\) and \(B_1\) are closed under negation, so is \(B_0 \cap B_1\). Thus, the number of \(s \in S\) for which \(f(s) \ne F_0(s)\) or \(f(-s) \ne F_0(-s)\) is at most \(2\delta N\). Recalling \(\deg (F_0) < \rho N\), we conclude \(\varDelta _S(f) \le 2\delta \), as desired.

Thus, we have found that \(\varDelta _S(f) \le \delta \), which in particular means \(\delta _S(f) \le \delta \), as desired. Let F(X) denote the (necessarily unique) polynomial of degree \(<\rho N\) such that \(d_S(F(S),f(S)) \le \delta N\). As \(d_S(F_0(S),f(S)) \le 2\delta N\) it also follows that \(d_S(F_0(S),F(S)) \le 3\delta N < (1-\rho )N\). As \(F_0(S),F(S) \in {\text {RS}}[\mathbb {F},S,\rho ]\) and this code has minimum distance \((1-\rho )N+1\), it must be that \(F_0(S) = F(S)\), which further implies \(F_0(X) = F(X)\) (as polynomials).

We can therefore extract a polynomial of degree \(<\rho N\) that agrees with the function f(X) on a \((1-\delta )\) fraction of coordinates: namely, the polynomial \(F_0(X)\). Furthermore, since f differs from \(F_0\) outside of \(\bar{B} = B_0 \cup \ldots \cup B_N\) (by the choice of f), we can find a subset \(B \subseteq \bar{B}\) of size at least \((1-\delta )N\) for which \(f(s) = F_0(s)\) for all \(s \in B\). We may therefore output the following witness for \(\mathfrak {R}_\delta \): \((F_0(X),B,(\gamma _s)_{s \in B})\).      \(\square \)

We are now in position to apply the machinery developed in Sect. 6 to conclude the following bound on the knowledge error.

Corollary 1

(Knowledge Error of FRI-protocol (one folding iteration)). Let \(\rho = 2^{-r}\) for some \(r \in \{0,1,\dots ,m\}\) and let \(\delta \in (0,1)\) be such that \(\delta < \frac{1-\rho }{4}\). The FRI-Protocol is knowledge sound with respect to relation \(\mathfrak {R}_\delta \cup \mathfrak {R}_{\textsf {coll}}\) with knowledge error

$$\begin{aligned} \kappa &:= 1 - \left( 1-\frac{N}{|\mathbb {F}|}\right) \left( 1-\frac{\left( \lceil (1-\delta )N\rceil -1\right) ^t}{N^t}\right) \le \frac{N}{|\mathbb {F}|} + (1-\delta )^t \ . \end{aligned}$$

Proof

Theorem 3 shows that the FRI-Protocol is \((\varGamma _{N+1},\varGamma )\)-out-of-\((\mathbb {F},\mathcal {C})\) special-sound. To apply Theorem 2, we must first establish that \(t_{\varGamma } \cdot t_{\varGamma _{N+1}} \le N^{O(1)}\). And this is indeed the case, as

$$ t_{\varGamma } \le \lceil (1-\delta )N\rceil \text { and } t_{\varGamma _{N+1}} \le N+1 \ . $$

We now establish the knowledge error. For this, it suffices to note that \(\max _{S \notin \varGamma _{N+1}}\frac{|S|}{|\mathbb {F}|} = \frac{N}{|\mathbb {F}|}\) while

$$\begin{aligned} \max _{A \notin \varGamma }\frac{|A|}{|\mathcal {C}|} = \frac{\left( \lceil (1-\delta )N\rceil -1\right) ^t}{N^t} \le (1-\delta )^t \ . \end{aligned}$$

To see the first equality, first note that if \(A \notin \varGamma \) then \(\bigcup _{c \in A}B(c)\) has cardinality less than \((1-\delta )N\), so the number of \(s \in S\) which can appear in a challenge \(c \in A\) is less than \((1-\delta )N\); as this is an integer, it is at most \(\lceil (1-\delta )N\rceil -1\). That is, \(A \subseteq T^t\) for some subset \(T \subseteq S\) with \(|T| \le \lceil (1-\delta )N\rceil -1\), and \(|T^t| \le \left( \lceil (1-\delta )N\rceil -1\right) ^t\). The equality holds as we can certainly choose \(A = T^t\) for some \(T \subseteq S\) of size \(\lceil (1-\delta )N\rceil -1\). For the denominator, as \(\mathcal {C}= S^t\) it has cardinality \(|S|^t=N^t\).    \(\square \)
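
As a sanity check on the corollary, one can evaluate \(\kappa \) numerically and confirm the stated upper bound. The parameters below are illustrative choices of ours, not values from the text:

```python
import math

def kappa(N, q, delta, t):
    """Knowledge error of the one-folding-iteration FRI protocol (Corollary 1):
    kappa = 1 - (1 - N/q) * (1 - (ceil((1-delta)*N) - 1)^t / N^t)."""
    a = N / q                                            # folding-challenge term
    b = (math.ceil((1 - delta) * N) - 1) ** t / N ** t   # query-phase term
    return 1 - (1 - a) * (1 - b)

# illustrative parameters: N = 2^10, |F| = 2^61 - 1, delta = 0.2, t = 40 queries
k = kappa(1024, 2**61 - 1, 0.2, 40)
```

Since \(1-(1-a)(1-b) = a + b - ab \le a + b\), the computed value is dominated by \(N/|\mathbb {F}| + (1-\delta )^t\), matching the displayed inequality.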

7.4 Additional Folding Iterations

The above analysis can naturally be extended to handle more folding iterations. Let \(F_0:=F\) be the low degree polynomial the prover commits to in the first round. We have folding rounds \(i=1,\dots ,\mu \), and in round i the verifier sends a challenge \(c_{i-1} \in \mathbb {F}\) and the prover provides a commitment to \(F_{i}(S^{\cdot 2^i})\) where \(F_{i}(X) = {\text {Fold}}(F_{i-1},c_{i-1})(X)\). After these folding iterations, the verifier picks t points \(s_1,\dots ,s_t \in S\) independently and uniformly at random and then checks that for all \(i=1,\dots ,\mu \) and \(j=1,\dots ,t\), we have

$$ F_i\big (s_j^{2^i}\big ) = \frac{F_{i-1}\big (s_j^{2^{i-1}}\big )+F_{i-1}\big (-s_j^{2^{i-1}}\big )}{2} + c_{i-1}\frac{F_{i-1}\big (s_j^{2^{i-1}}\big )-F_{i-1}\big (-s_j^{2^{i-1}}\big )}{2s_j^{2^{i-1}}} \ . $$
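
This consistency equation can be confirmed by simulating an honest prover over a toy field (\(\mathbb {F}_{97}\); the field and all names below are our own illustrative choices). Note in particular that the denominator involves \(s_j^{2^{i-1}}\), the point at which \(F_{i-1}\) is queried:

```python
P = 97

def poly_eval(coeffs, x, p=P):
    y = 0
    for a in reversed(coeffs):
        y = (y * x + a) % p
    return y

def fold(coeffs, c, p=P):
    """Fold(f, c) in coefficients: g_k = f_{2k} + c*f_{2k+1}."""
    even, odd = coeffs[0::2], coeffs[1::2]
    odd = odd + [0] * (len(even) - len(odd))
    return [(e + c * o) % p for e, o in zip(even, odd)]

inv2 = pow(2, P - 2, P)
# honest prover: F_i = Fold(F_{i-1}, c_{i-1}); check the verifier's equation at one point s
F = [3, 1, 4, 1, 5, 9, 2, 6]   # degree < 8
challenges = [11, 29, 31]      # c_0, c_1, c_2
Fs = [F]
for c in challenges:
    Fs.append(fold(Fs[-1], c))
s = 5
for i in range(1, len(Fs)):
    x = pow(s, 2 ** (i - 1), P)               # s^{2^{i-1}}
    lhs = poly_eval(Fs[i], (x * x) % P)       # F_i(s^{2^i})
    plus = (poly_eval(Fs[i - 1], x) + poly_eval(Fs[i - 1], (-x) % P)) * inv2 % P
    minus = (poly_eval(Fs[i - 1], x) - poly_eval(Fs[i - 1], (-x) % P)) * pow(2 * x % P, P - 2, P) % P
    assert lhs == (plus + challenges[i - 1] * minus) % P
```

The assertions pass for any nonzero s, since the folding identity holds as a polynomial identity over the field.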

The recursive structure of the extractor implies that after \(\mu \) folding iterations we obtain a protocol with the following generalized special-soundness guarantee.

Theorem 4

(FRI-protocol (\(\mu \) folding iterations)). Let \(\rho = 2^{-r}\) for some \(r \in \{0,1,\dots ,m\}\) and let \(\delta \in (0,1)\) be such that \(\delta < \frac{1-\rho }{4}\). Let \(\mu \in \mathbb {N}\) be such that \(\mu \le \log _2N\), and for \(i=1,2,\dots ,\mu \) let \(N_i := N/2^{i-1}\). The FRI-protocol with \(\mu \) folding iterations is perfectly complete with respect to relation \(\mathfrak {R}_0\) and \((\varGamma _{N_1+1},\varGamma _{N_2+1},\dots ,\varGamma _{N_\mu +1},\varGamma )\)-out-of-\((\mathbb {F},\mathbb {F},\dots ,\mathbb {F}, \mathcal {C})\) special-sound with respect to relation \(\mathfrak {R}_\delta \cup \mathfrak {R}_{\textsf {coll}}\).

This yields the following corollary regarding the knowledge error. However, we note that for \(\mu = \varOmega (\log N)\) the knowledge extractor only runs in expected quasi-polynomial time, so we cannot claim the standard notion of knowledge soundness. Nonetheless, we believe that the guarantee is meaningful. For the proof, we refer to the full version [1].

Corollary 2

(Knowledge Error of FRI-protocol (\(\mu \) folding iterations)). Let \(N=2^n\) for some \(n \in \mathbb {N}\), let \(\rho = 2^{-r}\) for some \(r \in \{0,1,\dots ,m\}\) and let \(\delta \in (0,1)\) be such that \(\delta < \frac{1-\rho }{4}\). Let \(\mu \in \mathbb {N}\) be such that \(\mu \le \log _2N\), and for \(i=1,2,\dots ,\mu \) let \(N_i := N/2^{i-1}\). There exists a function \(q(N,\mu ) = N^{O(\mu )}\) such that the following holds.

There exists an extraction algorithm that, when given oracle access to a (potentially dishonest prover) \(\mathcal {P}^*\) and input x of size N for the FRI-protocol, runs in time \(\le q(N,\mu )\) and outputs a witness in the relation \(\mathfrak {R}_\delta \cup \mathfrak {R}_{\textsf {coll}}\) with probability at least

$$ \frac{\epsilon (\mathcal {P}^*,x) - \kappa (N,\mu )}{q(N,\mu )} $$

where

$$\begin{aligned} \kappa (N,\mu ) &:= 1 - \left( \prod _{i=1}^{\mu }\left( 1-\frac{N_i}{|\mathbb {F}|}\right) \right) \cdot \left( 1-\frac{\left( \lceil (1-\delta )N\rceil -1\right) ^t}{N^t}\right) \\ &\le \sum _{i=1}^{\mu }\frac{N_i}{|\mathbb {F}|} + (1-\delta )^t \le \frac{2N}{|\mathbb {F}|} + (1-\delta )^t \ . \end{aligned}$$