1 Introduction

Succinct non-interactive arguments (SNARGs) are efficient certificates of membership in non-deterministic languages. Recent years have seen a surge of interest in zero-knowledge SNARGs of knowledge (zkSNARKs), with researchers studying constructions under different cryptographic assumptions, improvements in asymptotic efficiency, concrete performance of implementations, and numerous applications. The focus of this paper is SNARGs in the preprocessing setting, a notion that we motivate next.

When is fast verification possible? The size of a SNARG must be, as a minimum condition, sublinear in the size of the non-deterministic witness, and often is required to be even smaller (e.g., logarithmic in the size of the non-deterministic computation). The time to verify a SNARG would be, ideally, as fast as reading the SNARG. This is in general too much to hope for, however. The verification procedure must also read the description of the computation, in order to know what statement is being verified. While there are natural computations that have succinct descriptions (e.g., machine computations), in general the description of a computation could be as large as the computation itself, which means that the time to verify the SNARG could be asymptotically comparable to the size of the computation. This is unfortunate because there is a very useful class of computations for which we cannot expect fast verification: general circuit computations.

The preprocessing setting. An approach to avoid the above limitation is to design a verification procedure that has two phases: an offline phase that produces a short summary for a given circuit; and an online phase that uses this short summary to verify SNARGs that attest to the satisfiability of the circuit with different partial assignments to its input wires. Crucially, now the online phase could in principle be as fast as reading the SNARG (and the partial assignment), and thus sublinear in the circuit size. This goal was captured by preprocessing SNARGs [Gro10, Lip12, Gen+13, Bit+13], which have been studied in an influential line of works that has led to highly-efficient constructions that fulfill this goal (e.g., [Gro16]) and large-scale deployments in the real world that benefit from the online fast verification (e.g., [Zcash]).

The problem: circuit-specific SRS. The offline phase in efficient constructions of preprocessing SNARGs consists of sampling a structured reference string (SRS) that depends on the circuit that is being preprocessed. This implies that producing/validating proofs with respect to different circuits requires different SRSs. In many applications of interest, there is no single party that can be entrusted with sampling the SRS, and so real-world deployments have had to rely on cryptographic “ceremonies” [ZcashMPC] that use secure multi-party sampling protocols [Ben+15, BGG17, BGM17]. However, any modification in the circuit used in an application requires another cryptographic ceremony, which is unsustainable for many applications.

A solution: universal SRS. The above motivates preprocessing SNARGs where the SRS is universal, which means that the SRS supports any circuit up to a given size bound by enabling anyone, in an offline phase after the SRS is sampled, to publicly derive a circuit-specific SRS. Known techniques to obtain a universal SRS from circuit-specific SRS introduce expensive overheads due to universal simulation [Ben+14a, Ben+14b]. Also, these techniques lead to universal SRSs that are not updatable, a property introduced in [Gro+18] that significantly simplifies cryptographic ceremonies. The recent work of Maller et al. [Mal+19] overcomes these shortcomings, obtaining the first efficient construction of a preprocessing SNARG with universal (and updatable) SRS. Even so, the construction in [Mal+19] is considerably more expensive than the state of the art for circuit-specific SRS [Gro16]. In this paper we ask: can the efficiency gap between universal SRS and circuit-specific SRS be closed, or at least significantly reduced?

Concurrent work. A concurrent work [GWC19] studies the same question as this paper. See Sect. 1.2 for a brief discussion that compares the two works.

1.1 Our Results

In this paper we present \(\textsc {Marlin}\), a new preprocessing zkSNARK with universal (and updatable) SRS that improves on the prior state of the art [Mal+19, Sonic] in essentially all relevant efficiency parameters. In addition to reducing argument size by several group and field elements and reducing time complexity of the verifier by over \(3 \times \), our construction overcomes the main efficiency drawback of [Mal+19, Sonic]: the cost of producing proofs. Indeed, our construction improves time complexity of the prover by over \(10 \times \), achieving prover efficiency comparable to the case of preprocessing zkSNARKs with circuit-specific SRS. In Fig. 1 we provide a comparison of our construction and [Mal+19, Sonic], including argument sizes for two popular elliptic curves; the table also includes the state of the art for circuit-specific SRS. We have implemented \(\textsc {Marlin}\) in a Rust library, and report evaluation results in Fig. 2.

Our zkSNARK is the result of several contributions that we deem of independent interest, summarized below.

(1) A new methodology. We present a general methodology to construct preprocessing SNARGs (and also zkSNARKs) where the SRS is universal (and updatable). Our methodology produces succinct interactive arguments that can be made non-interactive via the Fiat–Shamir transformation [FS86], and so below we focus on preprocessing arguments with universal and updatable SRS.
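To illustrate the Fiat–Shamir transformation mentioned above, a public-coin verifier's challenges can be replaced by hashes of the transcript so far. The sketch below is a minimal toy, not the paper's construction; the helper name `fs_challenge`, the field size, and the transcript encoding are our own illustrative choices, and a real deployment would need careful domain separation and serialization.

```python
import hashlib

def fs_challenge(transcript: bytes, field_order: int) -> int:
    """Derive a public-coin challenge by hashing the transcript so far
    (illustrative sketch; real systems need domain separation)."""
    digest = hashlib.sha256(transcript).digest()
    return int.from_bytes(digest, "big") % field_order

# Each round, the prover appends its message to the transcript and both
# parties recompute the same challenge, removing the need for interaction.
transcript = b"srs-digest" + b"|round1:prover-message"
c1 = fs_challenge(transcript, 101)
assert 0 <= c1 < 101
```

Because the challenge is a deterministic function of the transcript, prover and verifier derive identical values without any message from the verifier, which is what makes the compiled argument non-interactive.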

Our key observation is that the ability to preprocess a circuit in an offline phase is closely related to constructing “holographic proofs” [Bab+91], which means that the verifier does not receive the circuit description as an input but, rather, makes a small number of queries to an encoding of it. These queries are in addition to queries that the verifier makes to proofs sent by the prover. Moreover, in this paper we focus on the setting where the encoding of the circuit description consists of low-degree polynomials and also where proofs are themselves low-degree polynomials—this can be viewed as a requirement that honest and malicious provers are “algebraic”. We call these algebraic holographic proofs (AHPs); see Sect. 4 for definitions.

We present a transformation that “compiles” any public-coin AHP into a corresponding preprocessing argument with universal (and updatable) SRS by using suitable polynomial commitments.

Fig. 1.

Comparison of two preprocessing zkSNARKs with universal (and updatable) SRS: the prior state of the art and our construction. We include the current state of the art for circuit-specific SRS (in gray), for reference. Here \(\mathbb {G}_1\)/\(\mathbb {G}_2\)/\(\mathbb {F}_{q}\) denote the number of elements or operations over the respective group/field; also, \(\text {f-MSM}(m)\) and \(\text {v-MSM}(m)\) denote fixed-base and variable-base multi-scalar multiplications (MSM) each of size m, respectively. The number of pairings that we report for Sonic’s verifier is lower than that reported in [Mal+19] because we account for standard batching techniques for pairing equations.

Fig. 2.

Measured performance of \(\textsc {Marlin}\) and [Gro16] over the BLS12-381 curve. We could not include measurements for [Mal+19, Sonic] because at the time of writing there is no working implementation of its unhelped variant.

Theorem 1

There is an efficient transformation that combines any public-coin AHP for a relation \(\mathcal {R}\) and an extractable polynomial commitment scheme to obtain a public-coin preprocessing argument with universal SRS for the relation \(\mathcal {R}\). The transformation preserves zero knowledge and proof of knowledge of the underlying AHP. The SRS is updatable provided the SRS of the polynomial commitment scheme is.

The above transformation provides us with a methodology to construct preprocessing zkSNARKs with universal SRS (see Fig. 3). Namely, to improve the efficiency of preprocessing zkSNARKs with universal SRS it suffices to improve the efficiency of simpler building blocks: AHPs (an information-theoretic primitive) and polynomial commitments (a cryptographic primitive).

The improvements achieved by our preprocessing zkSNARK (see Fig. 1) were obtained by following this methodology: we designed efficient constructions for each of these two building blocks (which we discuss shortly), combined them via Theorem 1, and then applied the Fiat–Shamir transformation [FS86].

Methodologies that combine information-theoretic probabilistic proofs and cryptographic tools have played a fundamental role in the construction of efficient argument systems. In the particular setting of preprocessing SNARGs, for example, the compiler introduced in [Bit+13] for circuit-specific SRS has paved the way towards current state-of-the-art constructions [Gro16], and also led to constructions that are plausibly post-quantum [Bon+17, Bon+18]. We believe that our methodology for universal SRS will also be useful in future work, and may lead to further efficiency improvements.

Fig. 3.

Our methodology for constructing preprocessing SNARGs with universal SRS.

(2) An efficient AHP for R1CS. We design an algebraic holographic proof (AHP) that achieves linear proof length and constant query complexity, among other useful efficiency features. The protocol is for rank-1 constraint satisfiability (R1CS), a well-known generalization of arithmetic circuits where the “circuit description” is given by coefficient matrices (see definition below). Note that the relations that we consider consist of triples rather than pairs, because we need to split the verifier’s input into a part for the offline phase and a part for the online phase. The offline input is called the index, and it consists of the coefficient matrices; the online input is called the instance, and it consists of a partial assignment to the variables. The algorithm that encodes the index (coefficient matrices) in the offline phase is called the indexer.

Definition 1

(informal). The indexed relation \(\mathcal {R}_{\mathrm {R1CS}}\) is the set of triples \(\big ((\mathbb {F},{A},{B},{C}),{x},{w}\big )\) where \(\mathbb {F}\) is a finite field, \({A},{B},{C}\) are \({n}\times {n}\) matrices over \(\mathbb {F}\), each containing at most \({m}\) non-zero entries, and \({z}:= (x, w)\) is a vector in \(\mathbb {F}^{{n}}\) such that \({A}{z}\circ {B}{z}= {C}{z}\). (Here “\(\circ \)” denotes the entry-wise product.)
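The R1CS condition can be sketched directly in code. The snippet below is a toy check over a small prime field; the field size, the matrices, and the helper name `r1cs_satisfied` are illustrative choices, not part of the paper's construction.

```python
# Toy R1CS satisfiability check over the prime field F_97 (illustrative).
P = 97

def r1cs_satisfied(A, B, C, z):
    """Check Az o Bz = Cz entry-wise over F_P, where o is the
    entry-wise (Hadamard) product."""
    mv = lambda M, v: [sum(M[i][j] * v[j] for j in range(len(v))) % P
                       for i in range(len(M))]
    Az, Bz, Cz = mv(A, z), mv(B, z), mv(C, z)
    return all((a * b) % P == c for a, b, c in zip(Az, Bz, Cz))

# 3x3 coefficient matrices encoding the single constraint z0 * z1 = z2
# (the remaining rows are identically zero and hold trivially).
A = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
C = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
assert r1cs_satisfied(A, B, C, [3, 4, 12])       # 3 * 4 = 12
assert not r1cs_satisfied(A, B, C, [3, 4, 11])   # 3 * 4 != 11
```

Here the index consists of the matrices \((A, B, C)\), while \(z = (x, w)\) splits into the online instance \(x\) and the witness \(w\), matching the indexed-relation view above.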

Theorem 2

(informal). There exists a constant-round AHP for the indexed relation \(\mathcal {R}_{\mathrm {R1CS}}\) with linear proof length and constant query complexity. The soundness error is \(O({m}/|\mathbb {F}|)\), and the construction is a zero knowledge proof of knowledge. The arithmetic complexity of the indexer is \(O({m}\log {m})\), of the prover is \(O({m}\log {m})\), and of the verifier is \(O(|x|+\log {m})\).

The literature on probabilistic proofs contains algebraic protocols that are holographic (e.g., [Bab+91] and [GKR15]) but none achieve constant query complexity, and so applying our methodology (Theorem 1) to these would lead to large argument sizes (many tens of kilobytes). These prior algebraic protocols rely on the multivariate sumcheck protocol applied to certain multivariate polynomials, which means that they incur sizable communication costs due to (a) the many rounds of the sumcheck protocol, and (b) the fact that applying the methodology would involve using multivariate polynomial commitment schemes that (for known constructions) lead to communication costs that are linear in the number of variables.

In contrast, our algebraic protocol relies on univariate polynomials and achieves constant query complexity, incurring small communication costs. Our algebraic protocol can be viewed as a “holographic variant” of the algebraic protocol for R1CS used in Aurora [Ben+19a], because it achieves an exponential improvement in verification time when the verifier is given a suitable encoding of the coefficient matrices.

(3) Extractable polynomial commitments. Polynomial commitment schemes, introduced in [KZG10], are commitment schemes specialized to work with univariate polynomials. The security properties in [KZG10], while sufficient for the applications therein, do not appear sufficient for standalone use, or even just for the transformation in Theorem 1. We propose a definition for polynomial commitment schemes that incorporates the functionality and security that we believe to suffice for standalone use (and in particular suffices for Theorem 1). Moreover, we show how to extend the construction of [KZG10] to fulfill this definition in the plain model under non-falsifiable knowledge assumptions, or via a more efficient construction in the algebraic group model [FKL18] under falsifiable assumptions. These constructions are of independent interest, and when combined with our transformation, lead to the first efficient preprocessing arguments with universal SRS under concrete knowledge assumptions, and also to the efficiency reported in Fig. 1.

We have implemented in a Rust library the polynomial commitment schemes, and our implementation of \(\textsc {Marlin}\) relies on this library. We deem this library of independent interest for other projects.

1.2 Related Work

In this paper we study the goal of constructing preprocessing SNARGs with universal SRS, which achieve succinct verification regardless of the structure of the non-deterministic computation being checked. The most relevant prior work is Sonic [Mal+19], on which we improve as already discussed (see Fig. 1). The notion of updatable SRS was defined and achieved in [Gro+18], but with a less efficient construction.

Concurrent work. A concurrent work [GWC19] studies the same question as this paper, and also obtains efficiency improvements over Sonic [Mal+19]. Below is a brief comparison.

  • We provide an implementation and evaluation of our construction, while [GWC19] do not. The estimated costs reported in [GWC19] suggest that an implementation may perform similarly to ours.

  • Similarly to our work, [GWC19] extends the polynomial commitment in [KZG10] to support batching, and proves the extension secure in the algebraic group model. We additionally show how to prove security in the plain model under non-falsifiable knowledge assumptions, and consider the problem of enforcing different degrees for different polynomials (a feature that is not needed in [GWC19]).

  • We show how to compile any algebraic holographic proof into a preprocessing argument with universal SRS, while [GWC19] focus on compiling a more restricted notion that they call “polynomial protocols”.

  • Our protocol natively supports R1CS, and can be viewed as a holographic variant of the algebraic protocol in [Ben+19a]. The protocol in [GWC19] natively supports a different constraint system, and involves a protocol that, similar to [Gro10], uses a permutation argument to attest that all variables in the same cycle of a permutation are equal (e.g., (1)(2, 3)(4) would require that the second and third entries are equal).

Preprocessing SNARGs with a URS. Setty [Set19] studies preprocessing SNARGs with a URS (uniform reference string), and describes a protocol that for n-gate arithmetic circuits and a chosen constant \(c \ge 2\) achieves proving time \(O_{\lambda }(n)\), argument size \(O_{\lambda }(n^{1/c})\), and verification time \(O_{\lambda }(n^{1-1/c})\). The protocol in [Set19] offers a tradeoff compared to our work: preprocessing with a URS instead of a SRS, at the cost of asymptotically larger argument size and verification time. The question of achieving preprocessing with a URS while also achieving asymptotically small argument size and verification time remains open.

The protocol in [Set19] is obtained by combining the multivariate polynomial commitments of [Wah+18] and a modern rendition of the PCP in [Bab+91] (which itself can be viewed as the “bare bones” protocol of [GKR15] for circuits of depth 1). [Set19] lacks an analysis of concrete costs, and also does not discuss how to achieve zero knowledge beyond stating that techniques in other papers [Zha+17a, Wah+18, Xie+19] can be applied. Nevertheless, argument sizes would at best be similar to these other papers (tens of kilobytes), which is much larger than our argument sizes (in the SRS model).

We conclude by noting that the informal security proof in [Set19] appears insufficient to show soundness of the argument system, because the polynomial commitment scheme is only assumed to be binding but not also extractable (there is no explanation of where the witness encoded in the committed polynomial comes from). Our definitions and security proofs, if ported over to the multivariate setting, would fill this gap.

Non-preprocessing SNARGs for arbitrary computations. Checking arbitrary circuits without preprocessing them requires the verifier to read the circuit, so the main goal is to obtain small argument size. In this setting of non-preprocessing SNARGs for arbitrary circuits, constructions with a URS (uniform reference string) are based on discrete logarithms [Boo+16, Bün+18] or hash functions [Ame+17, Ben+19a], while constructions with a universal SRS (structured reference string) combine polynomial commitments and non-holographic algebraic proofs [Gab19]; all use random oracles to obtain non-interactive arguments.

We find it interesting to remark that our methodology from Theorem 1 generalizes protocols such as [Gab19] in two ways. First, it formalizes the folklore approach of combining polynomial commitments and algebraic proofs to obtain arguments, identifying the security properties required to make this approach work. Second, it demonstrates how for algebraic holographic proofs the resulting argument enables preprocessing.

Non-preprocessing SNARGs for structured computations. Several works study SNARGs for structured computations. This structure enables fast verification without preprocessing. A line of works [Ben+17, Ben+19c, Ben+19b] combines hash functions and various interactive oracle proofs. Another line of works [Zha+17b, Zha+18, Zha+17a, Wah+18, Xie+19] combines multivariate polynomial commitments [PST13] and doubly-efficient interactive proofs [GKR15].

While in this paper we study a different setting (preprocessing SNARGs for arbitrary computations), there are similarities, and notable differences, in the polynomial commitments used in our work and prior works. We begin by noting that the notion of “multivariate polynomial commitments” varies considerably across prior works, despite the fact that most of those commitments are based on the protocol introduced in [PST13].

  • The commitments used in [Zha+17b, Zha+18] are required to satisfy extractability (a stronger notion than binding) because the security proof of the argument system involves extracting a polynomial encoding a witness. The commitment is a modification of [PST13] that uses knowledge commitments, a standard ingredient to achieve extractability under non-falsifiable assumptions in the plain model. Neither of these works considers hiding commitments, as zero knowledge is not a goal for them.

  • The commitments used in [Zha+17a, Wah+18] must be compatible with the Cramer–Damgård transform [CD98] used in constructing the argument system. They consider a modified setting where the sender does not reveal the value of the committed polynomial at a desired point but, instead, reveals a commitment to this value, along with a proof attesting that the committed value is correct. For this modified setting, they consider commitments that satisfy natural notions of extractability and hiding (achieving zero knowledge arguments is a goal in both papers). The commitments constructed in the two papers offer different tradeoffs. The commitment in [Zha+17a] is based on [PST13]: it relies on a SRS (structured reference string); it uses pairings; and for \(\ell \)-variate polynomials achieves \(O_{\lambda }(\ell )\)-size arguments that can be checked in \(O_{\lambda }(\ell )\) time. The commitment in [Wah+18] is inspired by [BG12] and [Bün+18]: it relies on a URS (uniform reference string); it does not use pairings; and for \(\ell \)-variate multilinear polynomials and a given constant \(c \ge 2\) achieves \(O_{\lambda }(2^{\ell /c})\)-size arguments that can be checked in \(O_{\lambda }(2^{\ell -\ell /c})\) time.

  • The commitments used in [Xie+19] are intended for the regular (unmodified) setting of commitment schemes where the sender reveals the value of the polynomial, because zero knowledge is later achieved by building on the algebraic techniques described in [CFS17]. The commitment definition in [Xie+19] considers binding and hiding, but not extractability. However, the given security analysis for the argument system does not seem to go through for this definition (there is no explanation of where the witness encoded in the committed polynomial comes from). Also, no commitment construction is provided in [Xie+19], and instead the reader is referred to [Zha+17a], which considers the modified setting described above.

In sum, there are multiple notions of commitment, and one must be precise about the functionality and security needed to construct an argument system. We now compare prior notions of commitments to the one that we use.

First, since in this paper we do not use the Cramer–Damgård transform for zero knowledge, commitments in the modified setting are not relevant. Instead, we achieve zero knowledge via bounded independence [Ben+16], and in particular we consider the familiar setting where the sender reveals evaluations of the committed polynomial. Second, prior works consider protocols where the sender commits to a polynomial in a single round, while we consider protocols where the sender commits to multiple polynomials of different degrees in each of several rounds. This multi-polynomial multi-round setting requires suitable extensions in terms of functionality (to enable batching techniques to save on argument size) and security (extractability and hiding need to be strengthened), which means that prior definitions do not suffice for us.

The above discrepancies have led us to formulate new definitions of functionality and security for polynomial commitments (as summarized in Sect. 2.2). We conclude by noting that, since in this paper we construct arguments that use univariate polynomials, our definitions are specialized to commitments for univariate polynomials. Corresponding definitions for multivariate polynomials can be obtained with straightforward modifications, and would strengthen definitions appearing in some prior works. Similarly, we fulfill the required definitions via natural adaptations of the univariate scheme of [KZG10], and analogous adaptations of the multivariate scheme of [PST13] would fulfill the multivariate analogues of these definitions.

2 Techniques

We discuss the main ideas behind our results. First we describe the two building blocks used in Theorem 1: AHPs and polynomial commitment schemes (described in Sects. 2.1 and 2.2 respectively). We describe how to combine these to obtain preprocessing arguments with universal SRS in Sect. 2.3. Next, we discuss constructions for these building blocks: in Sect. 2.4 we describe our AHP (underlying Theorem 2), and in Sect. 2.5 we describe our construction of polynomial commitments.

Throughout, instead of considering the usual notion of relations that consist of instance-witness pairs, we consider indexed relations, which consist of triples \((\mathbbm {i},\mathbbm {x},\mathbbm {w})\) where \(\mathbbm {i}\) is the index, \(\mathbbm {x}\) is the instance, and \(\mathbbm {w}\) is the witness. This is because \(\mathbbm {i}\) represents the part of the verifier input that is preprocessed in the offline phase (e.g., the circuit description) and \(\mathbbm {x}\) represents the part of the verifier input that comes in the online phase (e.g., a partial assignment to the circuit’s input wires). The indexed language corresponding to an indexed relation \(\mathcal {R}\), denoted \(\mathcal {L}(\mathcal {R})\), is the set of pairs \((\mathbbm {i},\mathbbm {x})\) for which there exists a witness \(\mathbbm {w}\) such that \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\).

2.1 Building Block: Algebraic Holographic Proofs

Interactive oracle proofs (IOPs) [BCS16, RRR16] are multi-round protocols where in each round the verifier sends a challenge and the prover sends an oracle (which the verifier can query). IOPs combine features of interactive proofs and probabilistically checkable proofs. Algebraic holographic proofs (AHPs) modify the notion of an IOP in two ways.

  • Holographic: the verifier does not receive its input explicitly but, rather, has oracle access to a prescribed encoding of it. This potentially enables the verifier to run in time that is much faster than the time to read its input in full. (Our constructions will achieve this fast verification.)

  • Algebraic: the honest prover must produce oracles that are low-degree polynomials (this restricts the completeness property), and all malicious provers must produce oracles that are low-degree polynomials (this relaxes the soundness property). The encoded input to the verifier must also be a low-degree polynomial.

Since in this paper we only work with univariate polynomials, our definitions focus on this case, but they can be modified in a straightforward way to be more general.

Informally, a (public-coin) AHP over a field \(\mathbb {F}\) for an indexed relation \(\mathcal {R}\) is specified by an indexer \(\mathbf {I}\), prover \(\mathbf {P}\), and verifier \(\mathbf {V}\) that work as follows.

  • Offline phase. The indexer \(\mathbf {I}\) receives as input the index \(\mathbbm {i}\) to be preprocessed, and outputs one or more univariate polynomials over \(\mathbb {F}\) encoding \(\mathbbm {i}\).

  • Online phase. For some instance \(\mathbbm {x}\) and witness \(\mathbbm {w}\), the prover \(\mathbf {P}\) receives \((\mathbbm {i},\mathbbm {x},\mathbbm {w})\) and the verifier \(\mathbf {V}\) receives \(\mathbbm {x}\); \(\mathbf {P}\) and \(\mathbf {V}\) interact over some (in this paper, constant) number of rounds, where in each round \(\mathbf {V}\) sends a challenge and \(\mathbf {P}\) sends one or more polynomials; after the interaction, \(\mathbf {V}(\mathbbm {x})\) probabilistically queries the polynomials output by the indexer and the polynomials output by the prover, and then accepts or rejects. Crucially, \(\mathbf {V}\) does not receive \(\mathbbm {i}\) as input, but instead queries the polynomials output by \(\mathbf {I}\) that encode \(\mathbbm {i}\). This enables the construction of verifiers \(\mathbf {V}\) that run in time that is sublinear in \(|\mathbbm {i}|\).

The completeness property states that for every \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\) the probability that \(\mathbf {P}(\mathbbm {i},\mathbbm {x},\mathbbm {w})\) convinces \(\mathbf {V}(\mathbbm {x})\) to accept is 1. The soundness property states that for every \((\mathbbm {i},\mathbbm {x}) \notin \mathcal {L}(\mathcal {R})\) and every admissible prover \(\tilde{\mathbf {P}}\) the probability that \(\tilde{\mathbf {P}}\) convinces \(\mathbf {V}(\mathbbm {x})\) to accept is at most a given soundness error \(\epsilon \). A prover is “admissible” if the degrees of the polynomials it outputs fit within prescribed degree bounds of the protocol. See Sect. 4 for details on AHPs.
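The holographic aspect can be made concrete with a toy example: the indexer interpolates a low-degree polynomial through the entries of the index, and the verifier then queries this encoding at a few points rather than reading the index in full. All parameters below (the field size, the index values, and the helper `interpolate`) are illustrative and not from the paper.

```python
# Toy sketch of holography: encode the index as a low-degree polynomial
# over a small prime field and let the verifier query the encoding.
P = 101  # illustrative prime field

def interpolate(points):
    """Return a function evaluating the unique polynomial of degree
    < len(points) through the given (x, y) pairs over F_P
    (Lagrange interpolation)."""
    def poly_at(x):
        total = 0
        for i, (xi, yi) in enumerate(points):
            num, den = 1, 1
            for j, (xj, _) in enumerate(points):
                if i != j:
                    num = num * (x - xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P
        return total
    return poly_at

index = [5, 7, 11, 13]                          # the "circuit description"
encoding = interpolate(list(enumerate(index)))  # offline phase (indexer)
# Online phase: the verifier queries the encoding at points of its choice
# instead of receiving the index as an explicit input.
assert encoding(2) == 11
```

In the actual AHP the encoding is of course far more structured, but the query-access pattern, and the resulting sublinear verification, is the same in spirit.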

2.2 Building Block: Polynomial Commitments

Informally, a polynomial commitment scheme [KZG10] allows a prover to produce a commitment \(c\) to a univariate polynomial \(p\in \mathbb {F}[{X}]\), and later “open” \(p({X})\) at any point \(z\in \mathbb {F}\), producing an evaluation proof \(\pi \) showing that the opened value is consistent with the polynomial “inside” \(c\) at \(z\). Turning this informal goal into a useful definition requires some care, however, as we explain below. In this paper we propose a set of definitions for polynomial commitment schemes that we believe are useful for standalone use, and in particular suffice as a building block for our compiler described in Sect. 2.3.

First, we consider constructions with strong efficiency requirements: the commitment \(c\) is much smaller than the polynomial \(p\) (e.g., \(c\) consists of a constant number of group elements), and the proof \(\pi \) can be validated very fast (e.g., in a constant number of cryptographic operations). These requirements not only rule out natural constructions, but also imply that the usual binding property, which states that an efficient adversary cannot open the same commitment to two different values, does not capture the desired security. Indeed, even if the adversary were to be bound to opening values of some function \(f :\mathbb {F}\rightarrow \mathbb {F}\), it may be that the function f is consistent with a polynomial whose degree is higher than what was claimed. This means that a security definition needs to incorporate guarantees about the degree of the committed function.

Second, in many applications of polynomial commitments, an adversary produces multiple commitments to polynomials within a round of interaction and across rounds of interaction. After this interaction, the adversary reveals values of all of these polynomials at one or more locations. This setting motivates a number of considerations. First, it is desirable to rely on a single set of public parameters for committing to multiple polynomials, even if the polynomials differ in degree. A construction such as that of [KZG10] can be modified in a natural way to achieve this by committing both to the polynomial and its shift to the maximum degree, similarly to techniques used to bundle multiple low-degree tests into a single one [Ben+19a]. This modification needs to be addressed in any proof of security. Second, it would be desirable to batch evaluation proofs across different polynomials for the same location. Again, the construction in [KZG10] can support this, but one must argue that security still holds in this more general case.
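The shifting technique just mentioned can be illustrated on plain coefficient lists: a claimed degree bound \(d\) under maximum degree \(D\) holds exactly when the shifted polynomial \(X^{D-d} \cdot p(X)\) still has degree at most \(D\). The sketch below operates on raw coefficients for clarity; in the actual scheme both polynomials are hidden inside commitments, and \(D\), \(d\), and the helper names are illustrative.

```python
# Toy illustration of the shifting trick for degree bounds.
# Coefficient lists are in low-to-high degree order.
D = 8  # maximum degree supported by the public parameters (illustrative)

def shift(p, d):
    """Return X^(D-d) * p(X) as a coefficient list, for claimed bound d."""
    return [0] * (D - d) + p

def degree(p):
    """Degree of a coefficient list (ignoring trailing zeros)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return len(p) - 1

p = [1, 2, 3]            # degree 2, within the claimed bound d = 3
assert degree(shift(p, 3)) <= D   # the shift stays within D
q = [1, 0, 0, 0, 5]      # degree 4, exceeding the claimed bound d = 3
assert degree(shift(q, 3)) > D    # the shift overflows past D
```

Because parameters of maximum degree \(D\) cannot commit to polynomials of degree above \(D\), checking the shifted commitment enforces the claimed bound \(d\).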

The preceding considerations require an extension of previous definitions and motivate our re-formulation of the primitive. Informally, a polynomial commitment scheme \(\mathsf {PC}\) is a tuple of algorithms \((\mathsf {Setup}, \mathsf {Trim}, \mathsf {Commit}, \mathsf {Open}, \mathsf {Check})\). The setup algorithm \(\mathsf {Setup}\) takes as input a security parameter and a maximum supported degree bound \(D\), and outputs public parameters \(\mathsf {pp}\) that contain the description of a finite field \(\mathbb {F}\). The “trimming” algorithm \(\mathsf {Trim}\) then deterministically specializes these parameters for a given set of degree bounds and outputs a committer key \(\mathsf {ck}\) and a receiver key \(\mathsf {rk}\). The sender can then invoke \(\mathsf {Commit}\) with input \(\mathsf {ck}\) and a list of polynomials \(\varvec{p}\) with respective degree bounds \(\varvec{d}\) to generate a set of commitments \(\varvec{c}\). Subsequently, the sender can use \(\mathsf {Open}\) to produce a proof \(\pi \) that convinces the receiver that the polynomials “inside” \(\varvec{c}\) respect the degree bounds \(\varvec{d}\) and, moreover, evaluate to the claimed set of values \(\varvec{v}\) at a given query set \(Q\) that specifies any number of evaluation points for each polynomial. The receiver can invoke \(\mathsf {Check}\) to check this proof.
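The algebraic identity behind [KZG10]-style evaluation proofs is that \(p(z) = v\) exactly when \((X - z)\) divides \(p(X) - v\), so an opening proof is (a commitment to) the quotient \(w(X) = (p(X) - v)/(X - z)\). The sketch below checks this identity over a toy field on plain coefficient lists; the real scheme hides the polynomials behind group elements and verifies the same identity with a pairing. Field size, polynomial, and helper names are illustrative.

```python
import random

P = 101  # toy prime field (illustrative)

def eval_poly(p, x):
    """Horner evaluation; p lists coefficients from highest to lowest degree."""
    acc = 0
    for c in p:
        acc = (acc * x + c) % P
    return acc

def open_at(p, z):
    """Return (v, w) with v = p(z) and w(X) = (p(X) - v) / (X - z),
    computed by synthetic division (the intermediate Horner values are
    exactly the quotient's coefficients; the last value is p(z))."""
    w, acc = [], 0
    for c in p:
        acc = (acc * z + c) % P
        w.append(acc)
    v = w.pop()
    return v, w

def check(p, z, v, w):
    """Check p(X) - v = w(X) * (X - z) at a random point; the real scheme
    performs this check on commitments via a pairing equation."""
    r = random.randrange(P)
    return (eval_poly(p, r) - v) % P == (eval_poly(w, r) * (r - z)) % P

p = [2, 0, 1, 5]          # 2X^3 + X + 5
v, w = open_at(p, 7)
assert v == eval_poly(p, 7)
assert check(p, 7, v, w)
```

Note that the quotient \(w\) exists only when the claimed value is correct, which is what ties an accepting proof to the true evaluation.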

The scheme \(\mathsf {PC}\) is required to satisfy extractability and efficiency properties, and also, optionally, a hiding property. We outline these properties below, and provide detailed definitions in the full version.

Extractability. Consider an efficient sender adversary \(\mathcal {A}\) that can produce a commitment \(c\) and degree bound \(d\le D\) such that, when asked for an evaluation at some point \(z\in \mathbb {F}\), it can produce a supposed evaluation \(v\) and proof \(\pi \) such that \(\mathsf {Check}\) accepts. Then \(\mathsf {PC}\) is extractable if for every maximum degree bound \(D\) and every sender adversary \(\mathcal {A}\) who can produce such commitments, there exists a corresponding efficient extractor \(\mathcal {E}_\mathcal {A}\) that outputs a polynomial \(p\) of degree at most \(d\) that “explains” \(c\), so that \(p(z) = v\). While for simplicity we have described the most basic case here, our definition considers adversaries and extractors who interact over multiple rounds, wherein the adversary may produce multiple commitments in each round and the extractor is required to output corresponding polynomials on a per-round basis (before seeing the query set, proof, or supposed evaluations).

In this work we rely on extractability to prove the security of our compiler (see Sect. 2.3); we do not know if weaker security notions studied in prior works, such as evaluation binding, suffice. More generally, we believe that extractability is a useful property that may be required across a range of other applications.

Efficiency. We require two notions of efficiency for \(\mathsf {PC}\). First, the time required to commit to a polynomial \(p\) and then to create an evaluation proof must be proportional to the degree of \(p\), and not to the maximum degree \(D\). (This ensures that the argument prover runs in time proportional to the size of the index.)

On the receiver’s side, the commitment size, proof size, and time to verify an opening must be independent of the claimed degrees for the polynomials. (This ensures that the argument produced by our compiler is succinct.)

Hiding. The hiding property of \(\mathsf {PC}\) states that commitments and proofs of evaluation reveal no information about the committed polynomial beyond the publicly stated degree bound and the evaluation itself. Namely, \(\mathsf {PC}\) is hiding if there exists an efficient simulator that outputs simulated commitments and simulated evaluation proofs that cannot be distinguished from their real counterparts by any malicious distinguisher that only knows the degree bound and the evaluation.

Analogously to the case of extractability, we actually consider a more general definition that considers commitments to multiple polynomials within and across multiple rounds; moreover, the definition considers the case where some polynomials are designated as not hidden (and thus given to the simulator) because in our application we sometimes prefer to commit to a polynomial in a non-hiding way (for efficiency reasons).

2.3 Compiler: From AHPs to Preprocessing Arguments with Universal SRS

We describe the main ideas behind Theorem 1, which uses polynomial commitment schemes to compile any (public-coin) AHP into a corresponding (public-coin) preprocessing argument with universal SRS. In a subsequent step, the argument can be made non-interactive via the Fiat–Shamir transformation, thereby yielding a preprocessing SNARG with universal SRS.

The basic intuition of the compiler follows the well-known framework of “commit to oracles and then open query answers” pioneered by Kilian [Kil92]. However, the commitment scheme used in our compiler leverages and enforces the algebraic structure of these oracles. While several works in the literature already take advantage of algebraic commitment schemes applied to algebraic oracles, our contribution is to observe that if we apply this framework to a holographic proof then we obtain a preprocessing argument.

Informally, first the argument indexer invokes the AHP indexer to generate polynomials, and then deterministically commits to these using the polynomial commitment scheme. Subsequently, the argument prover and argument verifier interact, each respectively simulating the AHP prover and AHP verifier. In each round, the argument prover sends succinct commitments to the polynomials output by the AHP prover in that round. After the interaction, the argument verifier declares its queries to the polynomials (of the prover and of the indexer). The argument prover replies with the desired evaluations along with an evaluation proof attesting to their correctness relative to the commitments.

This approach, while intuitive, must be proven secure. In particular, in the proof of soundness, we need to show that if the argument prover convinces the argument verifier with a certain probability, then we can find an AHP prover that convinces the AHP verifier with similar probability. This step is non-trivial: the AHP prover outputs polynomials, while the argument prover merely outputs succinct commitments and a few evaluations, which is much less information. Deducing the former from the latter requires extraction. This motivates considering polynomial commitment schemes that are extractable, in the sense described in Sect. 2.2. We do not know whether weaker security properties, such as the evaluation binding property studied in some prior works, suffice for proving the compiler secure.

The compiler outlined above is compatible with the properties of argument of knowledge and zero knowledge. Specifically, we prove that if the AHP is a proof of knowledge, then the compiler produces an argument of knowledge; also, if the AHP is (bounded-query) zero knowledge and the polynomial commitment scheme is hiding, then the compiler produces a zero knowledge argument.

See the full version for more details on the compiler.

2.4 Construction: An AHP for Constraint Systems

In prior sections we have described how we can use polynomial commitment schemes to compile AHPs into corresponding preprocessing SNARGs. In this section we discuss the main ideas behind Theorem 2, which provides an efficient AHP for the indexed relation corresponding to R1CS (see Definition 1). The preprocessing zkSNARK that we achieve in this paper (see Fig. 1) is based on this AHP.

Our protocol can be viewed as a “holographic variant” of the non-holographic algebraic proof for R1CS constructed in [Ben+19a]. Achieving holography involves designing a new sub-protocol that enables the verifier to evaluate low-degree extensions of the coefficient matrices at a random location. While in [Ben+19a] the verifier performed this computation on its own in time linear in the size of the matrices, in our protocol the verifier performs it exponentially faster by receiving help from the prover and having oracle access to the polynomials produced by the indexer. We introduce notation and then discuss the protocol.

Some notation. Consider an index specifying coefficient matrices, an instance \(\mathbbm {x}= x\in \mathbb {F}^{*}\) specifying a partial assignment to the variables, and a witness \(\mathbbm {w}= w\in \mathbb {F}^{*}\) specifying an assignment to the other variables such that the R1CS equation holds. The R1CS equation holds if and only if \({A}{z}\circ {B}{z}= {C}{z}\) for \({z}:= (x, w) \in \mathbb {F}^{{n}}\). Below, we let \({H}\) and \({K}\) be prescribed subsets of \(\mathbb {F}\) of sizes \({n}\) and \({m}\) respectively; we also let \(v_{{H}}({X})\) and \(v_{{K}}({X})\) be the vanishing polynomials of these two sets. (The vanishing polynomial of a set S is the monic polynomial of degree |S| that vanishes on S, i.e., \(\prod _{\gamma \in S}({X}-\gamma )\).) We assume that both \({H}\) and \({K}\) are smooth multiplicative subgroups. This allows interpolation/evaluation over \({H}\) in \(O({n}\log {n})\) operations and also makes \(v_{{H}}({X})\) computable in \(O(\log {n})\) operations (and similarly for \({K}\)). Given an \({n}\times {n}\) matrix M with rows/columns indexed by elements of \({H}\), we denote by \(\hat{{M}}({X},{Y})\) the low-degree extension of \({M}\), i.e., the polynomial of individual degree less than \({n}\) such that \(\hat{{M}}({\kappa },\iota )\) is the \(({\kappa },\iota )\)-th entry of \({M}\) for every \({\kappa },\iota \in {H}\).
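The smoothness of \({H}\) and \({K}\) is what makes the vanishing polynomials cheap for the verifier. The following Python sketch illustrates this for a multiplicative subgroup: \(v_{{H}}({X}) = {X}^{{n}} - 1\), so it can be evaluated with \(O(\log {n})\) multiplications via square-and-multiply. (The field \(\mathbb {F}_{17}\) and the order-8 subgroup below are our own illustrative choices, not parameters from this work.)

```python
# Toy check: for a multiplicative subgroup H of order n in F_P, the vanishing
# polynomial is v_H(X) = X^n - 1, evaluable in O(log n) field multiplications.
P, n = 17, 8
H = sorted(pow(9, k, P) for k in range(n))   # 9 generates a subgroup of order 8

def v_H(z):                                  # O(log n) via modular exponentiation
    return (pow(z, n, P) - 1) % P

def v_H_naive(z):                            # definition: product of (z - a) over H
    out = 1
    for a in H:
        out = out * (z - a) % P
    return out

assert all(v_H(a) == 0 for a in H)           # vanishes exactly on H
assert all(v_H(z) == v_H_naive(z) for z in range(P))
```

Both definitions agree because \(H\) is precisely the set of roots of \({X}^{{n}} - 1\), and both polynomials are monic of degree \({n}\).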

A non-holographic starting point. We sketch a non-holographic protocol for R1CS with linear proof length and constant query complexity, inspired by [Ben+19a], that forms the starting point of our work. In this case the prover receives as input the coefficient matrices together with the instance and witness, and the verifier receives as input the coefficient matrices and the instance. (The verifier reads the non-encoded index because we are describing a non-holographic protocol.)

In the first message the prover \(\mathbf {P}\) sends the univariate polynomial \(\hat{{z}}({X})\) of degree less than \({n}\) that agrees with the variable assignment \({z}\) on \({H}\), and also sends the univariate polynomials \(\hat{{z}}_{{A}}({X}), \hat{{z}}_{{B}}({X}), \hat{{z}}_{{C}}({X})\) of degree less than \({n}\) that agree with the linear combinations \({z}_{{A}}:= {A}{z}\), \({z}_{{B}}:= {B}{z}\), and \({z}_{{C}}:= {C}{z}\) on \({H}\). The prover is left to convince the verifier that the following two conditions hold:

$$\begin{aligned}&\text {(1) Entry-wise product:}\quad \forall \,{\kappa }\in {H}\,,\; \hat{{z}}_{{A}}({\kappa })\hat{{z}}_{{B}}({\kappa }) - \hat{{z}}_{{C}}({\kappa }) = 0. \\&\text {(2) Linear relation:}\quad \forall \,M \in \{{A},{B},{C}\}\,,\; \forall \,{\kappa }\in {H}\,,\; \hat{{z}}_{M}({\kappa }) = \sum _{\iota \in {H}} M[{\kappa },\iota ]\hat{{z}}(\iota ). \end{aligned}$$

(The prover also needs to convince the verifier that \(\hat{{z}}({X})\) encodes a full assignment \({z}\) that is consistent with the partial assignment \(x\), but for simplicity we ignore this in this informal discussion.)

In order to convince the verifier of the first (entry-wise product) condition, the prover sends the polynomial \(h_0({X})\) such that \(\hat{{z}}_{{A}}({X}) \hat{{z}}_{{B}}({X}) - \hat{{z}}_{{C}}({X}) = h_0({X}) v_{{H}}({X})\). This polynomial equation is equivalent to the first condition (the left-hand side equals zero everywhere on \({H}\) if and only if it is a multiple of \({H}\)’s vanishing polynomial). The verifier will check the equation at a random point \(\beta \in \mathbb {F}\): it queries \(\hat{{z}}_{{A}}({X}) ,\hat{{z}}_{{B}}({X}),\hat{{z}}_{{C}}({X}),h_0({X})\) at \(\beta \), evaluates \(v_{{H}}({X})\) at \(\beta \) on its own, and checks that \(\hat{{z}}_{{A}}(\beta ) \hat{{z}}_{{B}}(\beta ) - \hat{{z}}_{{C}}(\beta ) = h_0(\beta ) v_{{H}}(\beta )\). The soundness error is the maximum degree over the field size, which is at most \(2{n}/|\mathbb {F}|\).
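The quotient-based check above can be exercised mechanically. In the Python sketch below (toy field \(\mathbb {F}_{17}\); the polynomial `f` stands in for \(\hat{{z}}_{{A}}\hat{{z}}_{{B}} - \hat{{z}}_{{C}}\), and all names are our own), a polynomial that vanishes on \({H}\) is divided by \(v_{{H}}\) and the resulting identity is spot-checked at a random \(\beta \):

```python
# Toy illustration of the quotient-based check for the entry-wise product
# condition: a polynomial f that vanishes on H is a multiple of v_H, so the
# prover sends h0 = f / v_H and the verifier checks f(b) == h0(b) * v_H(b).
import random

P = 17                                   # toy field F_17
v_H = [(-1) % P] + [0] * 7 + [1]         # v_H(X) = X^8 - 1, low-degree-first

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_eval(a, z):
    acc = 0
    for c in reversed(a):
        acc = (acc * z + c) % P
    return acc

def poly_divmod(num, den):
    # long division over F_P; den (here v_H) is monic, so no inversions needed
    num = num[:]
    q = [0] * (len(num) - len(den) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1]     # leading coefficient of den is 1
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q[i] * d) % P
    return q, num[:len(den) - 1]

# f = v_H * (3 + 5X) vanishes on H by construction (it plays the role of
# zA_hat * zB_hat - zC_hat in the text)
f = poly_mul(v_H, [3, 5])
h0, rem = poly_divmod(f, v_H)
assert all(r == 0 for r in rem)          # divisible, since f vanishes on H

beta = random.randrange(P)
assert poly_eval(f, beta) == poly_eval(h0, beta) * poly_eval(v_H, beta) % P
```

In the actual protocol the division happens on the prover's side and only the random-point check is performed by the verifier.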

In order to convince the verifier of the second (linear relation) condition, the prover expects a random challenge \(\alpha \in \mathbb {F}\) from the verifier, and then replies in a second message. For each \(M \in \{{A},{B},{C}\}\), the prover sends polynomials \(h_M({X})\) and \(g_M({X})\) such that

$$\begin{aligned} \begin{array}{l} {r}(\alpha , {X}) \hat{{z}}_{M}({X}) - {r}_{{M}}(\alpha , {X}) \hat{{z}}({X}) = h_M({X}) v_{{H}}({X}) + {X}g_M({X}) \\ \quad \text {for} \quad {r}_{{M}}(Z, {X}) := \mathop {\sum }\nolimits _{{\kappa }\in {H}} {r}(Z, {\kappa }) \hat{{M}}({\kappa }, X) \end{array} \end{aligned}$$

where \({r}(Z, {X})\) is a prescribed polynomial of individual degree less than \({n}\) such that \(({r}(Z, {\kappa }))_{{\kappa }\in {H}}\) are \({n}\) linearly independent polynomials. Prior work [Ben+19a] on checking linear relations via univariate sumchecks shows that this polynomial equation is equivalent, up to a soundness error of \({n}/|\mathbb {F}|\) over \(\alpha \), to the second condition (Footnote 8). The verifier will check this polynomial equation at the random point \(\beta \in \mathbb {F}\): it queries \(\hat{{z}}({X}),\hat{{z}}_{{A}}({X}) ,\hat{{z}}_{{B}}({X}),\hat{{z}}_{{C}}({X}),h_M({X}),g_M({X})\) at \(\beta \), evaluates \(v_{{H}}({X})\) at \(\beta \) on its own, evaluates \({r}(Z,{X})\) and \({r}_{{M}}(Z,{X})\) at \((\alpha , \beta )\) on its own, and checks that \({r}(\alpha , \beta ) \hat{{z}}_{M}(\beta )-{r}_{{M}}(\alpha , \beta ) \hat{{z}}(\beta ) = h_M(\beta ) v_{{H}}(\beta ) + \beta g_M(\beta )\). The additional soundness error is \(2{n}/|\mathbb {F}|\).

The above is a simple 3-message protocol for R1CS with soundness error \(\max \{2{n}/|\mathbb {F}|,3{n}/|\mathbb {F}|\}=3{n}/|\mathbb {F}|\) in the setting where the honest prover and malicious provers send polynomials of prescribed degrees, which the verifier can query at any location. The proof length (sum of all degrees) is linear in \({n}\) and the query complexity is constant.

Barrier to holography. The verifier in the above protocol runs in time linear in the size of the coefficient matrices. While this is inherent in the non-holographic setting (because the verifier must read the index), we now discuss how exactly the verifier’s computation depends on the index. We shall later use this understanding to achieve an exponential improvement in the verifier’s time when given a suitable encoding of the index.

The verifier’s check for the entry-wise product is \(\hat{{z}}_{{A}}(\beta ) \hat{{z}}_{{B}}(\beta ) - \hat{{z}}_{{C}}(\beta ) = h_0(\beta ) v_{{H}}(\beta )\), and can be carried out in \(O(\log {n})\) operations regardless of the coefficient matrices contained in the index. In other words, this check is efficient even in the non-holographic setting. However, the verifier’s check for the linear relation is \({r}(\alpha , \beta ) \hat{{z}}_{M}(\beta )-{r}_{{M}}(\alpha , \beta ) \hat{{z}}(\beta ) = h_M(\beta ) v_{{H}}(\beta ) + \beta g_M(\beta )\), which has a linear cost. Concretely, evaluating the polynomial \({r}_{{M}}(Z,{X})\) at \((\alpha , \beta )\) requires \(\varOmega ({n}+ {m})\) operations.

In the holographic setting, a natural idea to reduce this cost would be to grant the verifier oracle access to the low-degree extension \(\hat{{M}}\) for \(M \in \{{A},{B},{C}\}\). This idea has two problems: the verifier still needs \(\varOmega ({n})\) operations to evaluate \({r}_{{M}}(Z,{X})\) at \((\alpha , \beta )\) and, moreover, the size of \(\hat{{M}}\) is quadratic in \({n}\), which means that the encoding of the index is \(\varOmega ({n}^2)\). We cannot afford such an expensive encoding in the offline preprocessing phase. We now describe how we overcome both of these problems, and obtain a holographic protocol.

Achieving holography. To overcome the above problems and obtain a holographic protocol, we rely yet again on the univariate sumcheck protocol. We introduce two additional rounds of interaction, and in each round the verifier learns that its verification equation holds provided the sumcheck from the next round holds. The last sumcheck will rely on polynomials output by the indexer, which the verifier knows are correct.

We address the first problem by letting the prover and verifier interact in an additional round, where we rely on an additional univariate sumcheck to reduce the problem of evaluating \({r}_{{M}}(Z,{X})\) at \((\alpha , \beta )\) to the problem of evaluating \(\hat{{M}}\) at \((\beta _2, \beta )\) for a random \(\beta _2 \in \mathbb {F}\). Namely, the verifier sends \(\beta \) to the prover, who computes

$$ \sigma _2 := {r}_{{M}}(\alpha , \beta ) = \sum _{{\kappa }\in {H}} {r}(\alpha , {\kappa }) \hat{{M}}({\kappa }, \beta ). $$

Then the prover replies with \(\sigma _2\) and the polynomials \(h_2({X})\) and \(g_2({X})\) such that

$$\begin{aligned} {r}(\alpha , {X}) \hat{{M}}({X}, \beta ) = h_2({X})v_{{H}}({X}) + {X}g_2({X}) + \sigma _2/{n}. \end{aligned}$$

Prior techniques on univariate sumcheck [Ben+19a] tell us that this equation is equivalent to the polynomial \({r}(\alpha , {X}) \hat{{M}}({X}, \beta )\) summing to \(\sigma _2\) on \({H}\). Thus the verifier needs to check this equation at a random \(\beta _2 \in \mathbb {F}\): \({r}(\alpha , \beta _2) \hat{{M}}(\beta _2, \beta ) = h_2(\beta _2)v_{{H}}(\beta _2) + \beta _2 g_2(\beta _2) + \sigma _2/{n}\). The only expensive part of this equation for the verifier is computing the value \(\hat{{M}}(\beta _2, \beta )\), which is problematic. Indeed, we have already noted that we cannot afford to simply let the verifier have oracle access to \(\hat{{M}}\), because this polynomial has quadratic size (it contains a quadratic number of terms).
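The \(\sigma _2/{n}\) term reflects the univariate sumcheck fact from [Ben+19a]: over a multiplicative subgroup \({H}\) of order \({n}\), the power sums \(\sum _{a\in {H}} a^k\) vanish unless \({n}\) divides \(k\), so a polynomial \(f\) of degree less than \({n}\) sums over \({H}\) to \({n}\) times its constant coefficient. A toy Python check (the field and the coefficients of \(f\) are illustrative choices of ours):

```python
# Sumcheck fact behind the (h, g, sigma/n) decomposition: for a multiplicative
# subgroup H of order n, sum over H of f equals n * (constant coefficient of f)
# whenever deg f < n, because sum_{a in H} a^k = 0 unless n | k.
P, n = 17, 8
H = [1, 2, 4, 8, 9, 13, 15, 16]              # the order-8 subgroup of F_17
f = [5, 3, 0, 11, 2, 0, 7, 1]                # low-degree-first, deg f < 8
total = sum(sum(c * pow(a, k, P) for k, c in enumerate(f)) for a in H) % P
assert total == n * f[0] % P                 # sum over H = n * f_0
```

This is why subtracting \(\sigma _2/{n}\) leaves a polynomial that sums to zero on \({H}\), which is exactly what the \(h_2, g_2\) decomposition certifies.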

We address this second problem as follows. Let \(u_{{H}}({X},{Y}) := \frac{v_{{H}}({X}) - v_{{H}}({Y})}{{X}- {Y}}\), a bivariate polynomial whose restriction to the diagonal is the formal derivative of the vanishing polynomial \(v_{{H}}({X})\); note that \(u_{{H}}({X},{Y})\) vanishes on the square \({H}\times {H}\) except on the diagonal, where it takes on the (non-zero) values \((u_{{H}}(a,a))_{a \in {H}}\). Moreover, \(u_{{H}}({X},{Y})\) can be evaluated at any point in \(\mathbb {F}\times \mathbb {F}\) in \(O(\log {n})\) operations. Using this polynomial, we can write \(\hat{{M}}\) as a sum of \({m}= |{K}|\) terms instead of \({n}^2 = |{H}|^2\) terms:

$$\begin{aligned} \hat{{M}}({X}, {Y}) := \sum _{{\kappa }\in {K}} u_{{H}}({X}, \hat{{\mathsf {row}}}_{M}({\kappa })) \cdot u_{{H}}({Y}, \hat{{\mathsf {col}}}_{M}({\kappa })) \cdot \hat{{\mathsf {val}}}_{M}({\kappa }), \end{aligned}$$

where \(\hat{{\mathsf {row}}}_{M},\hat{{\mathsf {col}}}_{M},\hat{{\mathsf {val}}}_{M}\) are the low-degree extensions of the row, column, and value of the non-zero entries in \({M}\) according to some canonical order over \({K}\) (Footnote 9).
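The two properties of \(u_{{H}}\) used here (it vanishes on \({H}\times {H}\) off the diagonal and is non-zero on it) can be sanity-checked numerically. The sketch below reuses a toy order-8 subgroup of \(\mathbb {F}_{17}\); the parameters are illustrative only:

```python
# u_H(x, y) = (v_H(x) - v_H(y)) / (x - y) for x != y; on the diagonal it equals
# the formal derivative v_H'(a) = n * a^(n-1) of v_H(X) = X^n - 1.
P, n = 17, 8
H = [1, 2, 4, 8, 9, 13, 15, 16]

def v(x):
    return (pow(x, n, P) - 1) % P

def u(x, y):
    if x == y:
        return n * pow(x, n - 1, P) % P            # formal derivative of X^n - 1
    return (v(x) - v(y)) * pow((x - y) % P, P - 2, P) % P  # Fermat inverse

for a in H:
    for b in H:
        if a != b:
            assert u(a, b) == 0                    # vanishes off the diagonal
        else:
            assert u(a, a) != 0                    # non-zero on the diagonal
```

Off the diagonal both \(v_{{H}}(a)\) and \(v_{{H}}(b)\) vanish, which forces \(u_{{H}}(a,b) = 0\); this is what makes the sparse sum above interpolate \({M}\) correctly on \({H}\times {H}\).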

This method of representing the low-degree extension of \({M}\) suggests an idea: let the verifier have oracle access to the polynomials \(\hat{{\mathsf {row}}}_{M},\hat{{\mathsf {col}}}_{M},\hat{{\mathsf {val}}}_{M}\) and do yet another univariate sumcheck, but this time over the set \({K}\). The verifier sends \(\beta _2\) to the prover, who computes

$$\begin{aligned} \sigma _3 := \hat{{M}}(\beta _2, \beta ) = \sum _{{\kappa }\in {K}} u_{{H}}(\beta _2, \hat{{\mathsf {row}}}_{M}({\kappa }))\cdot u_{{H}}(\beta , \hat{{\mathsf {col}}}_{M}({\kappa }))\cdot \hat{{\mathsf {val}}}_{M}({\kappa }). \end{aligned}$$

Then the prover replies with \(\sigma _3\) and the polynomials \(h_3({X})\) and \(g_3({X})\) such that

$$\begin{aligned} u_{{H}}(\beta _2, \hat{{\mathsf {row}}}_{M}({X})) u_{{H}}(\beta , \hat{{\mathsf {col}}}_{M}({X})) \hat{{\mathsf {val}}}_{M}({X}) = h_3({X})v_{{K}}({X}) + {X}g_3({X}) + \sigma _3/{m}. \end{aligned}$$

The verifier can then check this equation at a random \(\beta _3 \in \mathbb {F}\), which only requires \(O(\log {m})\) operations.

The above idea almost works; the one remaining problem is that \(h_3({X})\) has degree \(\varOmega ({n}{m})\) (because the left-hand side of the equation has quadratic degree), which is too expensive for our target of a quasilinear-time prover. We overcome this problem by letting the prover run the univariate sumcheck protocol on the unique low-degree extension \(\hat{f}({X})\) of the function \(f :{K}\rightarrow \mathbb {F}\) defined as \(f({\kappa }):=u_{{H}}(\beta _2, \hat{{\mathsf {row}}}_{M}({\kappa })) u_{{H}}(\beta , \hat{{\mathsf {col}}}_{M}({\kappa })) \hat{{\mathsf {val}}}_{M}({\kappa })\). Observe that \(\hat{f}({X})\) has degree less than \({m}\). The verifier checks that \(\hat{f}({X})\) and \(u_{{H}}(\beta _2, \hat{{\mathsf {row}}}_{M}({X})) u_{{H}}(\beta , \hat{{\mathsf {col}}}_{M}({X})) \hat{{\mathsf {val}}}_{M}({X})\) agree on \({K}\).

From sketch to protocol. In the above discussion we have ignored a number of technical aspects, such as proof of knowledge and zero knowledge (which are ultimately needed in the compiler if we want to construct a preprocessing zkSNARK). We have also not discussed time complexities of many algebraic steps, and we omitted discussion of how to batch multiple sumchecks into fewer ones, which brings important savings in argument size. For details, see our detailed construction in Sect. 5.

2.5 Construction: Extractable Polynomial Commitments

We now sketch how to construct a polynomial commitment scheme that achieves the strong functionality and security requirements of our definition in Sect. 2.2. Our starting point is the \(\mathsf {PolyCommit}_\text {DL}\) construction of Kate et al. [KZG10]; we then describe a sequence of natural and generic transformations that extend this construction to enable extractability, commitments to multiple polynomials, and the enforcement of per-polynomial degree bounds. In fact, once we arrive at a scheme that supports extractability for committed polynomials at a single point, our transformations build on this construction in a black-box way to first support per-polynomial degree bounds, and then query sets that may request multiple evaluation points per polynomial. See the full version for details of these transformations.

Starting point: \(\mathsf {PolyCommit}_\text {DL}\). The setup phase samples a cryptographically secure bilinear group \((\mathbb {G}_1, \mathbb {G}_2, \mathbb {G}_T, q, G, H, e)\) and then samples a committer key \(\mathsf {ck}\) and receiver key \(\mathsf {rk}\) for a given degree bound \(D\). The committer key consists of group elements encoding powers of a random field element \(\beta \), namely, \(\mathsf {ck}:=\{G, \beta G, \dotsc , \beta ^DG\} \in \mathbb {G}_1^{D+ 1}\). The receiver key consists of the group elements \(\mathsf {rk}:= (G, H, \beta H) \in \mathbb {G}_1\times \mathbb {G}_2^{2}\). Note that the SRS, which consists of the keys \(\mathsf {ck}\) and \(\mathsf {rk}\), is updatable because the coefficients of group elements in the SRS are all monomials.

To commit to a polynomial \(p\in \mathbb {F}_{q}[X]\), the sender computes \(c:=p(\beta )G\). To subsequently prove that the committed polynomial evaluates to \(v\) at a point \(z\), the sender computes a witness polynomial \(w(X) :=(p(X) - v)/(X- z)\), and provides as proof a commitment to \(w\): \(\pi :=w(\beta )G\). The idea is that \(w\) is a polynomial if and only if \(p(z) = v\); otherwise, it is a rational function, and cannot be committed to using \(\mathsf {ck}\).

Finally, to verify a proof of evaluation, the receiver checks that the commitment and proof of evaluation are consistent. That is, it checks that the proof commits to a polynomial of the form \((p(X) - v)/(X- z)\) by checking the equality \(e(c-vG, H) = e(\pi , \beta H- zH)\).
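The polynomial identity at the heart of this verification can be illustrated without the group structure (which we elide here; in the real scheme these polynomials are hidden behind encodings such as \(p(\beta )G\), and the pairing check enforces the identity). The Python sketch below, over a toy prime field of our choosing, shows that dividing \(p({X}) - v\) by \({X}- z\) leaves a zero remainder exactly when \(v = p(z)\):

```python
# Algebraic heart of [KZG10]: the witness w(X) = (p(X) - v)/(X - z) is a
# polynomial (zero remainder) exactly when the claimed value v equals p(z).
P = 101

def horner_divmod(p_coeffs, z):
    # synthetic division of p(X) by (X - z); the remainder is p(z)
    q, acc = [], 0
    for c in p_coeffs:                    # high-degree-first coefficients
        acc = (acc * z + c) % P
        q.append(acc)
    return q[:-1], q[-1]

p_coeffs = [3, 0, 7, 2]                   # p(X) = 3X^3 + 7X + 2
z = 5
w, v = horner_divmod(p_coeffs, z)         # dividing p by (X - z): remainder p(z)
assert v == (3 * 5**3 + 7 * 5 + 2) % P

# dividing p(X) - v' by (X - z) for a wrong claim v' leaves a nonzero remainder
wrong = [3, 0, 7, (2 - (v + 1)) % P]      # coefficients of p(X) - (v + 1)
_, rem = horner_divmod(wrong, z)
assert rem != 0                           # the "witness" is not a polynomial
```

In the scheme itself, the sender can only commit to honest polynomial witnesses with \(\mathsf {ck}\), which is why a non-zero remainder makes a false claim unprovable.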

Achieving extractability. While the foregoing construction guarantees correctness of evaluations, it does not by itself guarantee that a commitment actually “contains” a suitable polynomial of degree at most \(D\). We study two methods to address this issue, and thereby achieve extractability. One method is to modify the construction to use knowledge commitments [Gro10], and rely on a concrete knowledge assumption. The main disadvantage of this approach is that each commitment doubles in size. The other method is to move away from the plain model, and instead conduct the security analysis in the algebraic group model (AGM) [FKL18]. This latter method is more efficient because each commitment remains a single group element.

Committing to multiple polynomials at once. We enable the sender to simultaneously open multiple polynomials \([p_{i}]_{i=1}^{n}\) at the same point \(z\) as follows. Before generating a proof of evaluation for \([p_{i}]_{i=1}^{n}\), the sender requests from the receiver a random field element \(\xi \), which it uses to take a random linear combination of the polynomials, \(p:=\sum _{i=1}^n \xi ^i p_i\), and generates a proof of evaluation \(\pi \) for this polynomial \(p\).

The receiver verifies \(\pi \) by using the fact that the commitments are additively homomorphic. The receiver takes a linear combination of the commitments and claimed evaluations, obtaining the combined commitment \(c= \sum _{i=1}^n \xi ^i c_i\) and evaluation \(v= \sum _{i=1}^n \xi ^i v_i\). Finally, it checks the pairing equations for \(c\), \(\pi \), and \(v\).

Completeness of this check is straightforward, while soundness follows from the fact that if any polynomial does not match its evaluation, then the combined polynomial will not match its evaluation with high probability.
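Concretely, if some claimed value \(v_j\) differs from \(p_j(z)\), then \(\sum _i \xi ^i (v_i - p_i(z))\) is a non-zero polynomial of degree at most \(n\) in \(\xi \), so a random \(\xi \) exposes the cheat except with probability \(n/|\mathbb {F}|\). A toy Python sketch (the field and polynomials are our own illustrative choices, with commitments elided):

```python
# Batching sketch: combining evaluation claims with powers of a random xi.
# If some claimed v_j != p_j(z), the combined claim fails w.h.p. over xi.
import random

P = 101

def ev(coeffs, x):                        # Horner evaluation mod P, high-first
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % P
    return acc

polys = [[1, 2, 3], [4, 0, 5], [7, 7, 7]]     # three toy polynomials
z = 9
claims = [ev(q, z) for q in polys]
claims[1] = (claims[1] + 1) % P               # cheat on the second evaluation

xi = random.randrange(1, P)
combined_eval  = sum(pow(xi, i + 1, P) * ev(q, z) for i, q in enumerate(polys)) % P
combined_claim = sum(pow(xi, i + 1, P) * v for i, v in enumerate(claims)) % P
# here the difference is xi^2 * (the cheat), non-zero for every non-zero xi
assert combined_eval != combined_claim
```

In the actual scheme the receiver never sees the polynomials; it forms `combined_claim` from the claimed values and compares against the homomorphically combined commitment via the pairing check.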

Enforcing multiple degree bounds. The construction so far enforces a single bound \(D\) on the degrees of all the polynomials \(p_i\). To enforce a different degree bound \(d_i\) for each \(p_i\), we require the sender to commit not only to each \(p_i\), but also to “shifted polynomials” \(p'_i(X) :=X^{D- d_i}p_i(X)\). The proof of evaluation proves that, if \(p_i\) evaluates to \(v_i\) at \(z\), then \(p'_i\) evaluates to \(z^{D-d_i}v_i\).

The receiver checks that the commitment for each \(p'_i\) opens to the evaluation \(z^{D-d_i}v_i\); if \(z\) is sampled from a super-polynomial subset of \(\mathbb {F}_{q}\), then a polynomial with \(\mathrm {deg}(p_i) > d_i\) passes this check only with negligible probability. This trick is similar to the one used in [BS08, Ben+19a] to derive low-degree tests for specific degree bounds.

However, while sound, this approach is inefficient in our setting: the witness polynomial for \(p'_i\) has \(\varOmega (D)\) non-zero coefficients (instead of \(O(d_i)\)), and so constructing an evaluation proof for it requires \(\varOmega (D)\) scalar multiplications (instead of \(O(d_i)\)). To work around this, we instead produce a proof that the related polynomial \(p_i^\star (X) :=p'_i(X) - p_i(z)X^{D- d_i}\) evaluates to 0 at \(z\). As we show in the full version, the witness polynomial for this claim has \(O(d_i)\) non-zero coefficients, and so constructing the evaluation proof can be done in \(O(d_i)\) scalar multiplications. Completeness is preserved because the receiver can check the correct evaluation of \(p_i^\star \) by subtracting \(p_i(z)(\beta ^{D- d_i}G)\) from the commitment to the shifted polynomial \(p'_i\), thereby obtaining a commitment to \(p_i^\star \), while security is preserved because \(p'_i(z) = z^{D-d_i}v_i \iff p_i^\star (z) = 0\).
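The sparsity claim can be made concrete: \(p_i^\star ({X}) = {X}^{D-d_i}(p_i({X}) - p_i(z))\), so its witness is \({X}^{D-d_i} \cdot w({X})\) with \(w = (p_i - p_i(z))/({X}-z)\), which has at most \(d_i\) non-zero coefficients despite having degree close to \(D\). A toy Python sketch (illustrative \(D\), \(d\), and field of our choosing):

```python
# Why the shifted-claim workaround is cheap: the witness for
# p_star(X) = X^(D-d) * (p(X) - p(z)) is X^(D-d) * w(X), which has at most
# d nonzero coefficients even though its degree is close to D.
P = 101
D, d, z = 100, 3, 4
p_hi = [7, 5, 0, 2]                        # p(X) = 7X^3 + 5X^2 + 2, high-first

def divide_by_linear(coeffs_hi, pt):       # synthetic division by (X - pt)
    q, acc = [], 0
    for c in coeffs_hi:
        acc = (acc * pt + c) % P
        q.append(acc)
    return q[:-1], q[-1]

pz = divide_by_linear(p_hi, z)[1]          # remainder is p(z)
diff_hi = p_hi[:-1] + [(p_hi[-1] - pz) % P]
w, rem = divide_by_linear(diff_hi, z)      # w = (p(X) - p(z)) / (X - z)
assert rem == 0

# witness for p_star(X) is X^(D-d) * w(X): store only its nonzero terms
witness = {D - d + (len(w) - 1 - i): c for i, c in enumerate(w) if c}
assert len(witness) <= d                   # O(d) nonzero coefficients, not O(D)
```

The sparse representation is what lets the evaluation proof be computed with \(O(d_i)\) scalar multiplications.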

Evaluating at a query set instead of a single point. To support the case where the polynomials \([p_{i}]_{i=1}^{n}\) are evaluated at a set of points \(Q\), the sender proceeds as follows. Say that there are k different points \([z_i]_{i=1}^k\) in \(Q\). The sender partitions the polynomials \([p_{i}]_{i=1}^{n}\) into different groups such that every polynomial in a group is to be evaluated at the same point \(z_i\). The sender runs the opening algorithm on each group, and outputs the resulting list of evaluation proofs.

Achieving hiding. To additionally achieve hiding, we follow the above blueprint, replacing \(\mathsf {PolyCommit}_\text {DL}\) with the hiding scheme \(\mathsf {PolyCommit}_\text {Ped}\) described in [KZG10].

3 Preliminaries

We denote by [n] the set \(\{1, \dots , n\}\). We use \(\varvec{a} = [a_{i}]_{i=1}^{n}\) as a short-hand for the tuple \((a_1,\dots ,a_n)\), and \([\varvec{a}_i]_{i=1}^{n} = [[a_{i,j}]_{j=1}^{m}]_{i=1}^{n}\) as a short-hand for the tuple \((a_{1,1}, \dotsc , a_{1,m}, \dotsc , a_{n,1},\dots ,a_{n,m})\); \(|\varvec{a}|\) denotes the number of entries in \(\varvec{a}\). If x is a binary string then |x| denotes its bit length. If M is a matrix then \(\Vert M \Vert \) denotes the number of nonzero entries in M. If S is a finite set then |S| denotes its cardinality and \(x \leftarrow S\) denotes that x is an element sampled at random from S. We denote by \(\mathbb {F}\) a finite field, and whenever \(\mathbb {F}\) is an input to an algorithm we implicitly assume that \(\mathbb {F}\) is represented in a way that allows efficient field arithmetic. Given a finite set S, we denote by \(\mathbb {F}^{S}\) the set of vectors indexed by elements in S. We denote by \(\mathbb {F}[{X}]\) the ring of univariate polynomials over \(\mathbb {F}\) in \({X}\), and by \(\mathbb {F}^{<d}[{X}]\) the set of polynomials in \(\mathbb {F}[{X}]\) with degree less than d.

We denote by \(\lambda \in \mathbb {N}\) a security parameter. When we state that \(n \in \mathbb {N}\) for some variable n, we implicitly assume that \(n = \mathrm {poly}(\lambda )\). We denote by \(\mathrm {negl}(\lambda )\) an unspecified function that is negligible in \(\lambda \) (namely, a function that vanishes faster than the inverse of any polynomial in \(\lambda \)). When a function can be expressed in the form \(1-\mathrm {negl}(\lambda )\), we say that it is overwhelming in \(\lambda \). When we say that \(\mathcal {A}\) is an efficient adversary we mean that \(\mathcal {A}\) is a family \(\{\mathcal {A}_{\lambda }\}_{\lambda \in \mathbb {N}}\) of non-uniform polynomial-size circuits. If the adversary consists of multiple circuit families \(\mathcal {A}_1,\mathcal {A}_2,\dots \) then we write \(\mathcal {A}=(\mathcal {A}_1,\mathcal {A}_2,\dots )\).

Given two interactive algorithms A and B, we denote by \(\langle A(x),B(y)\rangle (z)\) the output of B(y, z) when interacting with A(x, z). Note that this output could be a random variable. If we use this notation when A or B is a circuit, we mean that we are considering a circuit that implements a suitable next-message function to interact with the other party of the interaction.

3.1 Indexed Relations

An indexed relation \(\mathcal {R}\) is a set of triples \((\mathbbm {i},\mathbbm {x},\mathbbm {w})\) where \(\mathbbm {i}\) is the index, \(\mathbbm {x}\) is the instance, and \(\mathbbm {w}\) is the witness; the corresponding indexed language \(\mathcal {L}(\mathcal {R})\) is the set of pairs \((\mathbbm {i},\mathbbm {x})\) for which there exists a witness \(\mathbbm {w}\) such that \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\). For example, the indexed relation of satisfiable boolean circuits consists of triples where the index is the description of a boolean circuit, the instance is a partial assignment to its input wires, and the witness is an assignment to the remaining wires that makes the circuit output 0. Given a size bound \(\mathsf {N}\in \mathbb {N}\), we denote by \(\mathcal {R}_{\mathsf {N}}\) the restriction of \(\mathcal {R}\) to triples \((\mathbbm {i},\mathbbm {x},\mathbbm {w})\) with \(|\mathbbm {i}| \le \mathsf {N}\).

4 Algebraic Holographic Proofs

We define algebraic holographic proofs (AHPs), the notion of proofs that we use. For simplicity, the formal definition below is tailored to univariate polynomials, because our AHP construction is in this setting. The definition can be modified in a straightforward way to consider the general case of multivariate polynomials.

We represent polynomials through the coefficients that define them, as opposed to through their evaluation over a sufficiently large domain (as is typically the case in probabilistic proofs). This definitional choice is due to the fact that we will consider verifiers that may query the polynomials at any location in the field of definition. Moreover, the field of definition itself can be chosen from a given field family, and so we make the field an additional input to all algorithms; this degree of freedom is necessary when combining this component with polynomial commitment schemes. Finally, we consider the setting of indexed relations (see Sect. 3.1), where the verifier’s input has two parts, the index and the instance; in the definition below, the verifier receives the index in encoded form and the instance explicitly.

Formally, an algebraic holographic proof (AHP) over a field family \(\mathcal {F}\) for an indexed relation \(\mathcal {R}\) is specified by a tuple

$$\begin{aligned} \mathsf {AHP}= (\mathsf {k}, \mathsf {s}, \mathsf {d}, \mathbf {I}, \mathbf {P}, \mathbf {V})\end{aligned}$$

where \(\mathsf {k}, \mathsf {s}, \mathsf {d}:\{0,1\}^{*} \rightarrow \mathbb {N}\) are polynomial-time computable functions and \(\mathbf {I},\mathbf {P},\mathbf {V}\) are three algorithms known as the indexer, prover, and verifier. The parameter \(\mathsf {k}\) specifies the number of interaction rounds, \(\mathsf {s}\) specifies the number of polynomials in each round, and \(\mathsf {d}\) specifies degree bounds on these polynomials.

In the offline phase (“0-th round”), the indexer \(\mathbf {I}\) receives as input a field \(\mathbb {F}\in \mathcal {F}\) and an index \(\mathbbm {i}\) for \(\mathcal {R}\), and outputs \(\mathsf {s}(0)\) polynomials \(p_{0,1},\dots ,p_{0,\mathsf {s}(0)} \in \mathbb {F}[{X}]\) whose degrees are at most the corresponding bounds specified by \(\mathsf {d}\). Note that the offline phase does not depend on any particular instance or witness, and merely considers the task of encoding the given index \(\mathbbm {i}\).

In the online phase, given an instance \(\mathbbm {x}\) and witness \(\mathbbm {w}\) such that \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\), the prover \(\mathbf {P}\) receives \((\mathbb {F},\mathbbm {i},\mathbbm {x},\mathbbm {w})\) and the verifier \(\mathbf {V}\) receives \((\mathbb {F},\mathbbm {x})\) and oracle access to the polynomials output by \(\mathbf {I}(\mathbb {F},\mathbbm {i})\). The prover \(\mathbf {P}\) and the verifier \(\mathbf {V}\) interact over \(\mathsf {k}\) rounds.

For \(i \in [\mathsf {k}]\), in the i-th round of interaction, the verifier \(\mathbf {V}\) sends a message \(\rho _i \in \mathbb {F}^{*}\) to the prover \(\mathbf {P}\); then the prover \(\mathbf {P}\) replies with \(\mathsf {s}(i)\) oracle polynomials \(p_{i,1},\dots ,p_{i,\mathsf {s}(i)}\in \mathbb {F}[{X}]\). The verifier may query any of the polynomials it has received any number of times. A query consists of a location \(z\in \mathbb {F}\) for an oracle \(p_{i,j}\), and its corresponding answer is \(p_{i,j}(z) \in \mathbb {F}\). After the interaction, the verifier accepts or rejects.

The function \(\mathsf {d}\) determines which provers to consider for the completeness and soundness properties of the proof system. In more detail, we say that a (possibly malicious) prover \(\tilde{\mathbf {P}}\) is admissible for \(\mathsf {AHP}\) if, on every interaction with the verifier \(\mathbf {V}\), it holds that for every round \(i\in [\mathsf {k}]\) and oracle index \(j\in [\mathsf {s}(i)]\) the degree of \(p_{i,j}\) is at most the corresponding bound specified by \(\mathsf {d}\). The honest prover \(\mathbf {P}\) is required to be admissible under this definition.

We say that \(\mathsf {AHP}\) has perfect completeness and soundness error \(\epsilon \) if the following holds.

  • Completeness. For every field \(\mathbb {F}\in \mathcal {F}\) and index-instance-witness tuple \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\), the probability that \(\mathbf {P}(\mathbb {F},\mathbbm {i},\mathbbm {x},\mathbbm {w})\) convinces \(\mathbf {V}^{\mathbf {I}(\mathbb {F},\mathbbm {i})}(\mathbb {F},\mathbbm {x})\) to accept in the interactive oracle protocol is 1.

  • Soundness. For every field \(\mathbb {F}\in \mathcal {F}\), index-instance pair \((\mathbbm {i},\mathbbm {x}) \notin \mathcal {L}(\mathcal {R})\), and admissible prover \(\tilde{\mathbf {P}}\), the probability that \(\tilde{\mathbf {P}}\) convinces \(\mathbf {V}^{\mathbf {I}(\mathbb {F},\mathbbm {i})}(\mathbb {F},\mathbbm {x})\) to accept in the interactive oracle protocol is at most \(\epsilon \).

The proof length \(\mathsf {l}\) is the sum of all degree bounds in the offline and online phases: \(\mathsf {l} := \sum _{i=0}^{\mathsf {k}} \sum _{j=1}^{\mathsf {s}(i)} \mathsf {d}(|\mathbbm {i}|,i,j)\). The intuition for this definition is that in a probabilistic proof each oracle would consist of the evaluation of a polynomial over a domain whose size (in field elements) is linearly related to its degree bound, so that the resulting proof length would be linearly related to the sum of all degree bounds.

The query complexity \(q\) is the total number of queries made by the verifier to the polynomials. This includes queries to the polynomials output by the indexer and those sent by the prover.

All AHPs that we construct achieve the stronger property of knowledge soundness (against admissible provers), and optionally also zero knowledge. We define both of these properties below.

Knowledge soundness. We say that \(\mathsf {AHP}\) has knowledge error \(\epsilon \) if there exists a probabilistic polynomial-time extractor \(\mathbf {E}\) for which the following holds. For every field \(\mathbb {F}\in \mathcal {F}\), index \(\mathbbm {i}\), instance \(\mathbbm {x}\), and admissible prover \(\tilde{\mathbf {P}}\), the probability that \(\mathbf {E}^{\tilde{\mathbf {P}}}(\mathbb {F},\mathbbm {i},\mathbbm {x},1^{\mathsf {l}})\) outputs \(\mathbbm {w}\) such that \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\) is at least the probability that \(\tilde{\mathbf {P}}\) convinces \(\mathbf {V}^{\mathbf {I}(\mathbb {F},\mathbbm {i})}(\mathbb {F},\mathbbm {x})\) to accept, minus \(\epsilon \). Here the notation \(\mathbf {E}^{\tilde{\mathbf {P}}}\) means that the extractor \(\mathbf {E}\) has black-box access to each of the next-message functions that define the interactive algorithm \(\tilde{\mathbf {P}}\). (In particular, the extractor \(\mathbf {E}\) can “rewind” the prover \(\tilde{\mathbf {P}}\).) Note that since \(\mathbf {E}\) receives the proof length in unary, \(\mathbf {E}\) has enough time to receive, and perform efficient computations on, polynomials output by \(\tilde{\mathbf {P}}\).

Zero knowledge. We say that \(\mathsf {AHP}\) has (perfect) zero knowledge with query bound \(\mathsf {b}\) and query checker \(\mathbf {C}\) if there exists a probabilistic polynomial-time simulator \(\mathbf {S}\) such that for every field \(\mathbb {F}\in \mathcal {F}\), index-instance-witness tuple \((\mathbbm {i},\mathbbm {x},\mathbbm {w}) \in \mathcal {R}\), and \((\mathsf {b},\mathbf {C})\)-query algorithm \(\tilde{\mathbf {V}}\), the random variables \(\mathrm {View}\big (\mathbf {P}(\mathbb {F},\mathbbm {i},\mathbbm {x},\mathbbm {w}),\tilde{\mathbf {V}}\big )\) and \(\mathbf {S}^{\tilde{\mathbf {V}}}(\mathbb {F},\mathbbm {i},\mathbbm {x})\), defined below, are identical. Here, we say that an algorithm is \((\mathsf {b},\mathbf {C})\)-query if it makes at most \(\mathsf {b}\) queries to oracles it has access to, and each query individually leads the checker \(\mathbf {C}\) to output “ok”.

  • \(\mathrm {View}\big (\mathbf {P}(\mathbb {F},\mathbbm {i},\mathbbm {x},\mathbbm {w}),\tilde{\mathbf {V}}\big )\) is the view of \(\tilde{\mathbf {V}}\), namely, the random variable \((r,a_1,\dots ,a_q)\) where r is \(\tilde{\mathbf {V}}\)’s randomness and \(a_1,\dots ,a_q\) are the responses to \(\tilde{\mathbf {V}}\)’s queries determined by the oracles sent by \(\mathbf {I}(\mathbb {F},\mathbbm {i})\) and \(\mathbf {P}(\mathbb {F},\mathbbm {i},\mathbbm {x},\mathbbm {w})\).

  • \(\mathbf {S}^{\tilde{\mathbf {V}}}(\mathbb {F},\mathbbm {i},\mathbbm {x})\) is the output of \(\mathbf {S}(\mathbb {F},\mathbbm {i},\mathbbm {x})\) when given straightline access to \(\tilde{\mathbf {V}}\) (\(\mathbf {S}\) may interact with \(\tilde{\mathbf {V}}\), without rewinding, by exchanging messages with \(\tilde{\mathbf {V}}\) and answering any oracle queries along the way), prepended with \(\tilde{\mathbf {V}}\)’s randomness r. Note that r could be of super-polynomial size, so \(\mathbf {S}\) cannot sample r on \(\tilde{\mathbf {V}}\)’s behalf and then output it; instead, as in prior work, we restrict \(\mathbf {S}\) to not see r, and prepend r to \(\mathbf {S}\)’s output.

A special case of interest. We only consider AHPs that satisfy the following properties.

  • Public coins: \(\mathsf {AHP}\) is public-coin if each verifier message to the prover is a uniformly random string of some prescribed length (or an empty string). All verifier queries can be postponed, without loss of generality, to a query phase that occurs after the interactive phase with the prover.

  • Non-adaptive queries: \(\mathsf {AHP}\) is non-adaptive if all of the verifier’s query locations are solely determined by the verifier’s randomness and inputs (the field \(\mathbb {F}\) and the instance \(\mathbbm {x}\)).

Given these properties, we can view the verifier as two subroutines that execute in the query phase: a query algorithm \(\mathbf {Q}_{\mathbf {V}}\) that produces query locations based on the verifier’s randomness, and a decision algorithm \(\mathbf {D}_{\mathbf {V}}\) that accepts or rejects based on the answers to the queries (and the verifier’s randomness). In more detail, \(\mathbf {Q}_{\mathbf {V}}\) receives as input the field \(\mathbb {F}\), the instance \(\mathbbm {x}\), and randomness \(\rho _{1},\dots ,\rho _{\mathsf {k}},\rho _{\mathsf {k}+1}\), and outputs a query set \(Q\) consisting of tuples \(((i,j),z)\) to be interpreted as “query \(p_{i,j}\) at \(z\in \mathbb {F}\)”; and \(\mathbf {D}_{\mathbf {V}}\) receives as input the field \(\mathbb {F}\), the instance \(\mathbbm {x}\), answers \((v_{((i,j),z)})_{((i,j),z) \in Q}\), and randomness \(\rho _{1},\dots ,\rho _{\mathsf {k}},\rho _{\mathsf {k}+1}\), and outputs the decision bit.

While the above properties are not strictly necessary for the compiler that we describe in the full version, all “natural” protocols that we are aware of (including those that we construct in this paper) satisfy these properties, and so we restrict our attention to public-coin non-adaptive protocols for simplicity.

5 AHP for Constraint Systems

We construct an AHP for rank-1 constraint satisfiability (R1CS) that has linear proof length and constant query complexity. Below we define the indexed relation that represents this problem, and then state our result.

Definition 1 (R1CS indexed relation). The indexed relation \(\mathcal {R}_{\mathrm {R1CS}}\) is the set of all triples

$$\begin{aligned} (\mathbbm {i},\mathbbm {x},\mathbbm {w}) = \big ((\mathbb {F},{H},{K},{A},{B},{C}),\, x,\, w\big ) \end{aligned}$$

where \(\mathbb {F}\) is a finite field, \({H}\) and \({K}\) are subsets of \(\mathbb {F}\), \({A},{B},{C}\) are \({H}\times {H}\) matrices over \(\mathbb {F}\) with \(|{K}| \ge \max \{\Vert {A} \Vert ,\Vert {B} \Vert ,\Vert {C} \Vert \}\), and \({z}:= (x, w)\) is a vector in \(\mathbb {F}^{{H}}\) such that \({A}{z}\circ {B}{z}= {C}{z}\).

Theorem 1. There exists an AHP for the indexed relation \(\mathcal {R}_{\mathrm {R1CS}}\) that is a zero knowledge proof of knowledge with the following features. The indexer uses \(O(|{K}| \log |{K}|)\) field operations and outputs \(O(|{K}|)\) field elements. The prover and verifier exchange 7 messages. To achieve zero knowledge against \(\mathsf {b}\) queries (with a query checker \(\mathbf {C}\) that rejects queries in \({H}\)), the prover uses \(O((|{K}|+\mathsf {b}) \log (|{K}|+\mathsf {b}))\) field operations and outputs a total of \(O(|{H}|+\mathsf {b})\) field elements. The verifier makes O(1) queries to the encoded index and to the prover’s messages, has soundness error \(O((|{K}|+\mathsf {b})/|\mathbb {F}|)\), and uses \(O(|x| + \log |{K}|)\) field operations.

Remark 1

(restrictions on domains). Our protocol uses the univariate sumcheck of [Ben+19a] as a subroutine, and in particular inherits the requirement that the domains \({H}\) and \({K}\) be additive or multiplicative subgroups of the field \(\mathbb {F}\). For simplicity, in our descriptions we use multiplicative subgroups because this is the case we use in our implementation; the case of additive subgroups involves only minor modifications. Moreover, the arithmetic complexities for the indexer and prover stated in Theorem 1 assume that the domains \({H}\) and \({K}\) are “FFT-friendly” (e.g., they have smooth sizes); this is not a requirement, since in general the arithmetic complexities will be those of an FFT over the domains \({H}\) and \({K}\). Note that we can assume without loss of generality that \(|{H}| = O(|{K}|)\): otherwise (if \(|{K}| < |{H}|/3\)) there are empty rows or columns across the matrices, which we can drop to reduce their size. Finally, we assume that \(|{H}| \le |\mathbb {F}|/2\).

This section is organized as follows: in Sect. 5.1 we introduce algebraic notations and facts used in this section, and then in Sect. 5.2 we describe an AHP for checking linear relations. Due to space constraints, we describe how to use this latter AHP to construct our AHP for R1CS only in the full version.

Throughout we assume that \({H}\) and \({K}\) come equipped with bijections \({\phi }_{_{{H}}} :{H}\rightarrow [|{H}|]\) and \({\phi }_{_{{K}}} :{K}\rightarrow [|{K}|]\) that are computable in linear time. Moreover, we define the two sets \({H}[\le k] := \{{\kappa }\in {H}:1 \le {\phi }_{_{{H}}}({\kappa }) \le k\}\) and \({H}[> k] := \{{\kappa }\in {H}:{\phi }_{_{{H}}}({\kappa }) > k\}\) to denote the first \(k\) elements in \({H}\) and the remaining elements, respectively. We can then write that \(x\in \mathbb {F}^{{H}[\le |x|]}\) and \(w\in \mathbb {F}^{{H}[> |x|]}\).

5.1 Algebraic Preliminaries

Polynomial encodings. For a finite field \(\mathbb {F}\), subset \(S\subseteq \mathbb {F}\), and function \(f :S\rightarrow \mathbb {F}\) we denote by \(\hat{f}\) the (unique) univariate polynomial over \(\mathbb {F}\) with degree less than \(|S|\) such that \(\hat{f}(a)=f(a)\) for every \(a \in S\). We sometimes abuse notation and write \(\hat{f}\) to denote some polynomial that agrees with f on \(S\), which need not equal the (unique) such polynomial of smallest degree.

Vanishing polynomials. For a finite field \(\mathbb {F}\) and subset \(S\subseteq \mathbb {F}\), we denote by \(v_{S}\) the unique non-zero monic polynomial of degree at most \(|S|\) that is zero everywhere on \(S\); \(v_{S}\) is called the vanishing polynomial of \(S\). If \(S\) is an additive or multiplicative coset in \(\mathbb {F}\) then \(v_{S}\) can be evaluated in \(\mathrm {polylog}(|S|)\) field operations. For example, if \(S\) is a multiplicative subgroup of \(\mathbb {F}\) then \(v_{S}({X}) = {X}^{|S|}-1\) and, more generally, if \(S\) is a \(\xi \)-coset of a multiplicative subgroup \(S_{0}\) (namely, \(S= \xi S_{0}\)) then \(v_{S}({X}) = \xi ^{|S|}v_{S_{0}}({X}/\xi ) = {X}^{|S|}-\xi ^{|S|}\); in either case, \(v_{S}\) can be evaluated in \(O(\log |S|)\) field operations.
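These closed forms are easy to sanity-check numerically. The sketch below is a toy illustration of our own: the prime \(p = 97\), the primitive root 5, the subgroup size 8, and the coset shift \(\xi = 3\) are illustrative choices, not parameters from the paper.

```python
# Toy check of vanishing polynomials over F_p (all parameters illustrative).
p = 97                             # small prime field; 5 is a primitive root mod 97
n = 8                              # subgroup size (divides p - 1 = 96)
omega = pow(5, (p - 1) // n, p)    # generator of the multiplicative subgroup of order 8
S = [pow(omega, k, p) for k in range(n)]

def v_S(x):                        # v_S(X) = X^|S| - 1 for a multiplicative subgroup
    return (pow(x, n, p) - 1) % p

xi = 3                             # coset shift: v_{xi*S}(X) = X^|S| - xi^|S|
coset = [(xi * a) % p for a in S]

def v_coset(x):
    return (pow(x, n, p) - pow(xi, n, p)) % p

assert all(v_S(a) == 0 for a in S)            # vanishes everywhere on S
assert all(v_coset(b) == 0 for b in coset)    # vanishes everywhere on the coset
assert v_S(2) != 0                            # nonzero off S (2 is not in S)
```

Both evaluations take \(O(\log |S|)\) multiplications via square-and-multiply, matching the stated cost.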

Derivative of vanishing polynomials. We rely on various properties of a bivariate polynomial \(u_{S}\) introduced in [Ben+19b]. For a finite field \(\mathbb {F}\) and subset \(S\subseteq \mathbb {F}\), we define

$$\begin{aligned} u_{S}({X},{Y}) := \frac{v_{S}({X}) - v_{S}({Y})}{{X}- {Y}}, \end{aligned}$$

which is a polynomial of individual degree \(|S|-1\) because \({X}-{Y}\) divides \({X}^i-{Y}^i\) for any positive integer i. Note that \(u_{S}({X},{X})\) is the formal derivative of the vanishing polynomial \(v_{S}({X})\). The bivariate polynomial \(u_{S}({X},{Y})\) satisfies two useful algebraic properties. First, the univariate polynomials \((u_{S}({X},a))_{a \in S}\) are linearly independent, and \(u_{S}({X},{Y})\) is their (unique) low-degree extension. Second, \(u_{S}({X},{Y})\) vanishes on the square \(S\times S\) except on the diagonal, where it takes on the (non-zero) values \((u_{S}(a,a))_{a \in S}\).

If \(S\) is an additive or multiplicative coset in \(\mathbb {F}\), \(u_{S}({X},{Y})\) can be evaluated at any \((\alpha ,\beta ) \in \mathbb {F}^2\) in \(\mathrm {polylog}(|S|)\) field operations because in this case both \(v_{S}\) (and its derivative) can be evaluated in \(\mathrm {polylog}(|S|)\) field operations. For example, if \(S\) is a multiplicative subgroup then \(u_{S}({X},{Y}) = ({X}^{|S|}-{Y}^{|S|})/({X}-{Y})\) and \(u_{S}({X},{X}) = |S| {X}^{|S|-1}\), so both can be evaluated in \(O(\log |S|)\) field operations.
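The stated behavior of \(u_{S}\) on \(S \times S\) can be checked in the same toy setting (the field and subgroup are again our own illustrative choices; the modular inverse `pow(x - y, -1, p)` requires Python 3.8+):

```python
# Toy check of u_S(X,Y) = (X^n - Y^n)/(X - Y) on a multiplicative subgroup
# of order n = 8 in F_97 (illustrative parameters).
p, n = 97, 8
omega = pow(5, (p - 1) // n, p)
S = [pow(omega, k, p) for k in range(n)]

def u_S(x, y):
    if x != y:                               # off the diagonal: the closed form directly
        return (pow(x, n, p) - pow(y, n, p)) * pow(x - y, -1, p) % p
    return n * pow(x, n - 1, p) % p          # on the diagonal: |S| * X^{|S|-1}

assert all(u_S(a, b) == 0 for a in S for b in S if a != b)   # vanishes off the diagonal
assert all(u_S(a, a) != 0 for a in S)                        # nonzero on the diagonal
```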

Univariate sumcheck for subgroups. Prior work [Ben+19a] shows that, given a multiplicative subgroup S of \(\mathbb {F}\), a polynomial \(f({X})\) sums to \(\sigma \) over \(S\) if and only if \(f({X})\) can be written as \(h({X})v_{S}({X}) + {X}g({X}) + \sigma /|S|\) for some \(h({X})\) and \(g({X})\) with \(\mathrm {deg}(g) < |S|-1\). This can be viewed as a univariate sumcheck protocol, and we shall rely on it throughout this section.
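This decomposition can be computed mechanically: reducing \(f\) modulo \(v_{S}({X}) = {X}^{|S|}-1\) simply folds coefficients (since \({X}^{|S|} \equiv 1 \pmod {v_{S}}\)), and the constant term of the remainder must equal \(\sigma /|S|\) because \(\sum _{a \in S} a^{i} = 0\) for every \(0< i < |S|\). The Python sketch below is a toy illustration with parameters of our own choosing (\(\mathbb {F}_{97}\), \(|S| = 8\), an arbitrary \(f\) of degree \(< 2|S|\)):

```python
# Toy check of the univariate sumcheck decomposition f = h*v_S + X*g + sigma/|S|
# over F_97 with a multiplicative subgroup S of order 8 (illustrative parameters).
p, n = 97, 8
omega = pow(5, (p - 1) // n, p)
S = [pow(omega, k, p) for k in range(n)]

f = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]    # coefficients of f, deg f < 2n (arbitrary)

def ev(coeffs, x):                          # Horner evaluation mod p
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

sigma = sum(ev(f, a) for a in S) % p

# Reduce f mod v_S = X^n - 1: X^{i+n} == X^i, so the high coefficients fold down
# and the quotient h is just the high part of f.
h = f[n:]
r = [(f[i] + (f[i + n] if i + n < len(f) else 0)) % p for i in range(n)]

assert r[0] == sigma * pow(n, -1, p) % p    # constant term of remainder is sigma/|S|
g = r[1:]                                   # so f = h*v_S + X*g + sigma/|S|

# Spot-check the identity at an arbitrary point.
x = 5
lhs = ev(f, x)
rhs = (ev(h, x) * (pow(x, n, p) - 1) + x * ev(g, x) + sigma * pow(n, -1, p)) % p
assert lhs == rhs
```

Note that \(g\) is the remainder with its constant term \(\sigma /|S|\) stripped, so \(\deg (g) < |S|-1\) as required.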

5.2 AHP for the Lincheck Problem

The lincheck problem for univariate polynomials considers the task of deciding whether two polynomials encode vectors that are linearly related in a prescribed way. In more detail, the problem is parametrized by a field \(\mathbb {F}\), two subsets \({H}\) and \({K}\) of \(\mathbb {F}\), and a matrix \({M}\in \mathbb {F}^{{H}\times {H}}\) with \(|{K}| \ge \Vert {M} \Vert >0\). Given oracle access to two low-degree polynomials \(f_1, f_2\in \mathbb {F}^{< d}[{X}]\), the problem asks to decide whether for every \(a \in {H}\) it holds that \(f_1(a) = \sum _{b \in {H}} {M}_{a,b} \cdot f_2(b)\), by asking a small number of queries to \(f_1\) and \(f_2\). The matrix \({M}\) thus prescribes the linear relations that relate the values of \(f_1\) and \(f_2\) on \({H}\).

Ben-Sasson et al. [Ben+19a] solve this problem by reducing the lincheck problem to a sumcheck problem, and then reducing the sumcheck problem to low-degree testing (of univariate polynomials). In particular, this prior work achieves a 2-message algebraic non-holographic protocol that solves the lincheck problem with linear proof length and constant query complexity. In this section we show how to achieve a 6-message algebraic holographic protocol, again with linear proof length and constant query complexity. In Sect. 5.2.1 we describe the indexer algorithm, and in Sect. 5.2.2 we describe the prover and verifier algorithms. In the full version we provide a diagram that summarizes the protocol, and provide completeness, soundness, and efficiency analyses.

5.2.1 Offline Phase: Encoding the Linear Relation

The indexer \(\mathbf {I}\) for the lincheck problem receives as input a field \(\mathbb {F}\), two subsets \({H}\) and \({K}\) of \(\mathbb {F}\), and a matrix \({M}\in \mathbb {F}^{{H}\times {H}}\) with \(|{K}| \ge \Vert {M} \Vert \). The non-zero entries of \({M}\) are assumed to be presented in some canonical order (e.g., row-wise or column-wise). The output of \(\mathbf {I}\) is three univariate polynomials \(\hat{{\mathsf {row}}},\hat{{\mathsf {col}}},\hat{{\mathsf {val}}}\) over \(\mathbb {F}\) of degree less than \(|{K}|\) such that the following polynomial is a low-degree extension of \({M}\):

$$\begin{aligned} \hat{{M}}({X}, {Y}) := \sum _{{\kappa }\in {K}} u_{{H}}({X}, \hat{{\mathsf {row}}}({\kappa })) u_{{H}}({Y}, \hat{{\mathsf {col}}}({\kappa })) \hat{{\mathsf {val}}}({\kappa }). \end{aligned}$$
(1)

The three aforementioned polynomials are the (unique) low-degree extensions of the three functions \({\mathsf {row}},{\mathsf {col}},{\mathsf {val}}:{K}\rightarrow \mathbb {F}\) that respectively represent the row index, column index, and value of the non-zero entries of the matrix \({M}\). In more detail, for every \({\kappa }\in {K}\) with \(1 \le {\phi }_{_{{K}}}({\kappa }) \le \Vert {M} \Vert \):

  • \({\mathsf {row}}({\kappa }) := {\phi }_{_{{H}}}^{-1}({t}_{{\kappa }})\) where \({t}_{{\kappa }}\) is the row index of the \({\phi }_{_{{K}}}({\kappa })\)-th nonzero entry in \({M}\);

  • \({\mathsf {col}}({\kappa }) := {\phi }_{_{{H}}}^{-1}({t}_{{\kappa }})\) where \({t}_{{\kappa }}\) is the column index of the \({\phi }_{_{{K}}}({\kappa })\)-th nonzero entry in \({M}\);

  • \({\mathsf {val}}({\kappa })\) is the value of the \({\phi }_{_{{K}}}({\kappa })\)-th nonzero entry in \({M}\), divided by \(u_{{H}}({\mathsf {row}}({\kappa }), {\mathsf {row}}({\kappa })) u_{{H}}({\mathsf {col}}({\kappa }), {\mathsf {col}}({\kappa }))\).

Also, \({\mathsf {val}}({\kappa })\) returns the element 0 for every \({\kappa }\in {K}\) with \({\phi }_{_{{K}}}({\kappa }) > \Vert {M} \Vert \), while \({\mathsf {row}}({\kappa })\) and \({\mathsf {col}}({\kappa })\) return an arbitrary element in \({H}\) for such \({\kappa }\). The evaluation tables of these functions can be found in \(O(|{K}| \log |{H}|)\) operations, from which interpolation yields the desired polynomials in \(O(|{K}| \log |{K}|)\) operations.

Recall from Sect. 5.1 that the bivariate polynomial \(u_{{H}}({X},{Y})\) vanishes on the square \({H}\times {H}\) except on the diagonal, where it takes on the (non-zero) values \((u_{{H}}(a,a))_{a \in {H}}\). By construction of the polynomials \(\hat{{\mathsf {row}}},\hat{{\mathsf {col}}},\hat{{\mathsf {val}}}\), the polynomial \(\hat{{M}}({X}, {Y})\) agrees with the matrix \({M}\) everywhere on the domain \({H}\times {H}\). The individual degree of \(\hat{{M}}({X}, {Y})\) is less than \(|{H}|\). Thus, \(\hat{{M}}\) is the unique low-degree extension of \({M}\).
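This construction can be sanity-checked numerically. Since the sum in Eq. (1) ranges over \({\kappa }\in {K}\), evaluating \(\hat{{M}}\) at points of \({H}\times {H}\) needs only the evaluation tables of \({\mathsf {row}},{\mathsf {col}},{\mathsf {val}}\), not the interpolated polynomials. The Python sketch below is a toy example of our own (the field \(\mathbb {F}_{97}\), \(|{H}| = 4\), and the 3-entry sparse matrix are illustrative choices):

```python
# Toy check that sum_k u_H(X, row(k)) * u_H(Y, col(k)) * val(k) agrees with a
# sparse matrix M on H x H (illustrative parameters: F_97, |H| = 4).
p, nH = 97, 4
omega = pow(5, (p - 1) // nH, p)         # generator of the subgroup H of order 4
H = [pow(omega, k, p) for k in range(nH)]

def u_H(x, y):                           # u_H for a multiplicative subgroup
    if x != y:
        return (pow(x, nH, p) - pow(y, nH, p)) * pow(x - y, -1, p) % p
    return nH * pow(x, nH - 1, p) % p

# Sparse matrix over H x H, given as (row element, col element, value) triples.
entries = [(H[0], H[1], 7), (H[2], H[0], 3), (H[3], H[3], 11)]

# Indexer tables: val is the entry divided by u_H(row,row) * u_H(col,col).
row = [r for (r, c, v) in entries]
col = [c for (r, c, v) in entries]
val = [v * pow(u_H(r, r) * u_H(c, c), -1, p) % p for (r, c, v) in entries]

def M_hat(x, y):                         # Eq. (1), evaluated via the tables
    return sum(u_H(x, row[k]) * u_H(y, col[k]) * val[k]
               for k in range(len(entries))) % p

M = {(r, c): v for (r, c, v) in entries}
assert all(M_hat(a, b) == M.get((a, b), 0) for a in H for b in H)
```

The division by \(u_{{H}}({\mathsf {row}}({\kappa }),{\mathsf {row}}({\kappa })) u_{{H}}({\mathsf {col}}({\kappa }),{\mathsf {col}}({\kappa }))\) in the table is exactly what cancels the diagonal values of \(u_{{H}}\) in the sum.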

We rewrite the polynomial \(\hat{{M}}({X}, {Y})\) in a form that will be useful later:

Claim 1

$$\begin{aligned} \hat{{M}}({X}, {Y}) = \sum _{{\kappa }\in {K}} \frac{v_{{H}}({X})}{({X}- \hat{{\mathsf {row}}}({\kappa }))} \cdot \frac{v_{{H}}({Y})}{({Y}- \hat{{\mathsf {col}}}({\kappa }))} \cdot \hat{{\mathsf {val}}}({\kappa }) . \end{aligned}$$
(2)

Proof

Note that \(v_{{H}}(\hat{{\mathsf {row}}}({\kappa })) = v_{{H}}(\hat{{\mathsf {col}}}({\kappa })) = 0\) for every \({\kappa }\in {K}\) because \(\hat{{\mathsf {row}}}({X})\) and \(\hat{{\mathsf {col}}}({X})\) map \({K}\) to \({H}\) and \(v_{{H}}\) vanishes on \({H}\). Therefore:

$$\begin{aligned} \hat{{M}}({X}, {Y})&= \sum _{{\kappa }\in {K}} u_{{H}}({X}, \hat{{\mathsf {row}}}({\kappa })) \cdot u_{{H}}({Y}, \hat{{\mathsf {col}}}({\kappa })) \cdot \hat{{\mathsf {val}}}({\kappa }) \\&= \sum _{{\kappa }\in {K}} \frac{v_{{H}}({X}) - v_{{H}}(\hat{{\mathsf {row}}}({\kappa }))}{{X}- \hat{{\mathsf {row}}}({\kappa })} \cdot \frac{v_{{H}}({Y}) - v_{{H}}(\hat{{\mathsf {col}}}({\kappa }))}{{Y}- \hat{{\mathsf {col}}}({\kappa })} \cdot \hat{{\mathsf {val}}}({\kappa }) \\&= \sum _{{\kappa }\in {K}} \frac{v_{{H}}({X})}{({X}- \hat{{\mathsf {row}}}({\kappa }))} \cdot \frac{v_{{H}}({Y})}{({Y}- \hat{{\mathsf {col}}}({\kappa }))} \cdot \hat{{\mathsf {val}}}({\kappa }). \end{aligned}$$

   \(\square \)
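Claim 1 can likewise be verified numerically at points outside \({H}\), where every denominator is nonzero. The toy sketch below (illustrative parameters and arbitrary tables of our own choosing) confirms that Eq. (1) and Eq. (2) agree on a small grid:

```python
# Toy numeric check of Claim 1: Eq. (1) and Eq. (2) agree at points outside H
# (illustrative parameters over F_97, |H| = 4).
p, nH = 97, 4
omega = pow(5, (p - 1) // nH, p)
H = [pow(omega, k, p) for k in range(nH)]

def v_H(x):                              # vanishing polynomial of H
    return (pow(x, nH, p) - 1) % p

def u_H(x, y):
    if x != y:
        return (pow(x, nH, p) - pow(y, nH, p)) * pow(x - y, -1, p) % p
    return nH * pow(x, nH - 1, p) % p

# Arbitrary "indexer" tables: row/col take values in H, val is arbitrary.
row = [H[0], H[2], H[3]]
col = [H[1], H[0], H[3]]
val = [5, 9, 13]

def eq1(x, y):   # sum of u_H(X,row) * u_H(Y,col) * val, as in Eq. (1)
    return sum(u_H(x, row[k]) * u_H(y, col[k]) * val[k] for k in range(3)) % p

def eq2(x, y):   # sum of v_H(X)/(X-row) * v_H(Y)/(Y-col) * val, as in Eq. (2)
    return sum(v_H(x) * pow(x - row[k], -1, p) * v_H(y) * pow(y - col[k], -1, p)
               * val[k] for k in range(3)) % p

outside = [x for x in range(2, 20) if x not in H]     # points off H
assert all(eq1(x, y) == eq2(x, y) for x in outside[:5] for y in outside[:5])
```

Per term, the agreement is exactly the cancellation used in the proof: \(v_{{H}}\) vanishes at \(\hat{{\mathsf {row}}}({\kappa })\) and \(\hat{{\mathsf {col}}}({\kappa })\), which both lie in \({H}\).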

5.2.2 Online Phase: Proving and Verifying the Linear Relation

The prover \(\mathbf {P}\) for the lincheck problem receives as input a field \(\mathbb {F}\), two subsets \({H}\) and \({K}\) of \(\mathbb {F}\), a matrix \({M}\in \mathbb {F}^{{H}\times {H}}\) with \(|{K}| \ge \Vert {M} \Vert \), and two polynomials \(f_1, f_2\in \mathbb {F}^{< d}[{X}]\). The verifier \(\mathbf {V}\) for the lincheck problem receives as input the field \(\mathbb {F}\) and two subsets \({H}\) and \({K}\) of \(\mathbb {F}\); \(\mathbf {V}\) also has oracle access to the polynomials \(\hat{{\mathsf {row}}},\hat{{\mathsf {col}}},\hat{{\mathsf {val}}}\) output by the indexer \(\mathbf {I}\) invoked on appropriate inputs.

The protocol begins with a reduction from the lincheck problem to a sumcheck problem: \(\mathbf {V}\) samples a random element \(\alpha \in \mathbb {F}\) and sends it to \(\mathbf {P}\). Letting \({r}({X}, {Y})\) denote the polynomial \(u_{{H}}({X},{Y})\), the linear independence of the polynomials \(({r}({X},a))_{a \in {H}}\) implies that \(f_1(a) = \sum _{b \in {H}} {M}_{a,b} f_2(b)\) for every \(a \in {H}\) if and only if, with high probability over the choice of \(\alpha \), it holds that \(\sum _{{\kappa }\in {H}} {r}(\alpha ,{\kappa }) \big (f_1({\kappa }) - \sum _{b \in {H}} {M}_{{\kappa },b} f_2(b)\big ) = 0\). Hence \(\mathbf {P}\) is left to convince \(\mathbf {V}\) that the following univariate polynomial sums to 0 on \({H}\):

$$\begin{aligned} q_1({X}) := {r}(\alpha , {X}) f_1({X}) - {r}_{{M}}(\alpha , {X}) f_2({X}) \end{aligned}$$
(3)

where \({r}_{{M}}({X}, {Y}) := \sum _{{\kappa }\in {H}} {r}({X}, {\kappa }) \hat{{M}}({\kappa }, {Y})\).

We rely on the univariate sumcheck protocol for this step: \(\mathbf {P}\) sends to \(\mathbf {V}\) the polynomials \(g_1({X})\) and \(h_1({X})\) such that \(q_1({X}) = h_1({X}) v_{{H}}({X}) + {X}g_1({X})\). In order to check this polynomial identity, \(\mathbf {V}\) samples a random element \(\beta _1\in \mathbb {F}\) with the intention of checking the identity at \({X}:= \beta _1\). For the right-hand side, \(\mathbf {V}\) queries \(g_1\) and \(h_1\) at \(\beta _1\), and then evaluates \(h_1(\beta _1) v_{{H}}(\beta _1) + \beta _1g_1(\beta _1)\) in \(O(\log |{H}|)\) operations. For the left-hand side, \(\mathbf {V}\) queries \(f_1\) and \(f_2\) at \(\beta _1\) and then must ask \(\mathbf {P}\) for help in evaluating \({r}(\alpha ,\beta _1) f_1(\beta _1)-{r}_{{M}}(\alpha , \beta _1) f_2(\beta _1)\). The reason is that while \({r}(\alpha ,\beta _1)\) is easy to evaluate (it requires \(O(\log |{H}|)\) operations), \({r}_{{M}}(\alpha ,\beta _1) = \sum _{{\kappa }\in {H}} {r}(\alpha , {\kappa }) \hat{{M}}({\kappa }, \beta _1)\) in general requires \(\varOmega (|{H}| |{K}|)\) operations.

We thus rely on the univariate sumcheck protocol again. We define

$$\begin{aligned} q_2({X}) := {r}(\alpha , {X}) \hat{{M}}({X}, \beta _1) \end{aligned}$$
(4)

\(\mathbf {V}\) sends \(\beta _1\) to \(\mathbf {P}\), and then \(\mathbf {P}\) replies with the sum \(\sigma _2 := \sum _{{\kappa }\in {H}} {r}(\alpha , {\kappa }) \hat{{M}}({\kappa }, \beta _1)\) and the polynomials \(g_2({X})\) and \(h_2({X})\) such that \(q_2({X}) = h_2({X}) v_{{H}}({X}) + {X}g_2({X}) + \sigma _2/|{H}|\). In order to check this polynomial identity, \(\mathbf {V}\) samples a random element \(\beta _2\in \mathbb {F}\) with the intention of checking the identity at \({X}:= \beta _2\). For the right-hand side, \(\mathbf {V}\) queries \(g_2\) and \(h_2\) at \(\beta _2\), and then evaluates \(h_2(\beta _2) v_{{H}}(\beta _2) + \beta _2g_2(\beta _2) + \sigma _2/|{H}|\) in \(O(\log |{H}|)\) operations. To evaluate the left-hand side, however, \(\mathbf {V}\) must ask \(\mathbf {P}\) for help. The reason is that while \({r}(\alpha ,\beta _2)\) is easy to evaluate (it requires \(O(\log |{H}|)\) operations), \(\hat{{M}}(\beta _2, \beta _1)\) in general requires \(\varOmega (|{K}|)\) operations.

We thus rely on the univariate sumcheck protocol (yet) again: \(\mathbf {V}\) sends \(\beta _2\) to \(\mathbf {P}\), and then \(\mathbf {P}\) replies with the value \(\sigma _3 := \hat{{M}}(\beta _2, \beta _1)\), which the verifier must check. Note though that we cannot use the sumcheck protocol directly to compute the sum obtained from Eq. (1):

$$\begin{aligned} \hat{{M}}(\beta _2, \beta _1) = \sum _{{\kappa }\in {K}} u_{{H}}(\beta _2, \hat{{\mathsf {row}}}({\kappa })) u_{{H}}(\beta _1, \hat{{\mathsf {col}}}({\kappa })) \hat{{\mathsf {val}}}({\kappa }). \end{aligned}$$

The reason is that the degree of the above addend, if we replace \({\kappa }\) with an indeterminate, is \(\varOmega (|{H}||{K}|)\), which means that the polynomial \(h_3\) sent as part of a sumcheck protocol would also have degree \(\varOmega (|{H}||{K}|)\), which is not within our budget of an AHP with proof length \(O(|{H}|+|{K}|)\). Instead, we make the minor modification that in the earlier rounds \(\beta _1\) and \(\beta _2\) are sampled from \(\mathbb {F}{\setminus }{H}\) instead of \(\mathbb {F}\), and we will leverage the sumcheck protocol to verify the equivalent (well-defined) expression from Eq. (2):

$$\begin{aligned} \hat{{M}}(\beta _2, \beta _1) = \sum _{{\kappa }\in {K}} \frac{v_{{H}}(\beta _2) v_{{H}}(\beta _1)\hat{{\mathsf {val}}}({\kappa })}{(\beta _2- \hat{{\mathsf {row}}}({\kappa }))(\beta _1- \hat{{\mathsf {col}}}({\kappa }))}. \end{aligned}$$

This may appear to be an odd choice, because if we replace \({\kappa }\) with an indeterminate in the sum above, we obtain a rational function that is (in general) not a polynomial, and so does not immediately fit the sumcheck protocol. Nevertheless, we are still able to use the sumcheck protocol with it, as we now explain.

Define \(f_3({X})\) to be the (unique) polynomial of degree less than \(|{K}|\) such that

$$\begin{aligned} \forall \, {\kappa }\in {K},\; f_3({\kappa }) = \frac{v_{{H}}(\beta _2) v_{{H}}(\beta _1) \hat{{\mathsf {val}}}({\kappa }) }{ (\beta _2- \hat{{\mathsf {row}}}({\kappa }))(\beta _1- \hat{{\mathsf {col}}}({\kappa })) }. \end{aligned}$$
(5)

The prover computes the polynomials \(g_3({X})\) and \(h_3({X})\) such that

$$\begin{aligned} f_3({X})&= {X}g_3({X}) + \sigma _3/|{K}|, \\ h_3({X}) v_{{K}}({X})&= v_{{H}}(\beta _2) v_{{H}}(\beta _1) \hat{{\mathsf {val}}}({X}) - (\beta _2- \hat{{\mathsf {row}}}({X}))(\beta _1- \hat{{\mathsf {col}}}({X}))f_3({X}). \end{aligned}$$

The first equation demonstrates that \(f_3\) sums to \(\sigma _3\) over \({K}\), and the second equation demonstrates that \(f_3\) agrees with the correct addends over \({K}\). These two equations can be combined in a single equation that involves only \(g_3({X})\) and \(h_3({X})\):

$$\begin{aligned} h_3({X})&v_{{K}}({X}) = v_{{H}}(\beta _2) v_{{H}}(\beta _1) \hat{{\mathsf {val}}}({X})\\&- (\beta _2- \hat{{\mathsf {row}}}({X})) (\beta _1- \hat{{\mathsf {col}}}({X})) ({X}g_3({X}) + \sigma _3/|{K}|). \end{aligned}$$

The prover thus only sends the two polynomials \(g_3({X})\) and \(h_3({X})\). In order to check this polynomial identity, \(\mathbf {V}\) samples a random element \(\beta _3\in \mathbb {F}\) with the intention of checking the identity at \({X}:= \beta _3\). Then \(\mathbf {V}\) queries \(g_3\), \(h_3\), \(\hat{{\mathsf {row}}}\), \(\hat{{\mathsf {col}}}\), \(\hat{{\mathsf {val}}}\) at \(\beta _3\), and then evaluates \( v_{{H}}(\beta _2)v_{{H}}(\beta _1)\hat{{\mathsf {val}}}(\beta _3) - (\beta _2- \hat{{\mathsf {row}}}(\beta _3))(\beta _1- \hat{{\mathsf {col}}}(\beta _3)) (\beta _3g_3(\beta _3) + \sigma _3/|{K}|) = h_3(\beta _3) v_{{K}}(\beta _3)\) in \(O(\log |{K}|)\) operations.
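The entire third sumcheck step can be exercised end-to-end on a toy instance. In the sketch below, everything is an illustrative choice of ours (\(\mathbb {F}_{97}\), \(|{H}| = 4\), \(|{K}| = 8\), arbitrary \({\mathsf {row}},{\mathsf {col}},{\mathsf {val}}\) tables, and fixed \(\beta _1,\beta _2,\beta _3\) with \(\beta _1,\beta _2\) outside \({H}\)); interpolation over \({K}\) is done by a naive quadratic-time inverse DFT rather than an FFT:

```python
# Toy end-to-end run of the third sumcheck step: Eq. (5) plus the combined
# identity, over F_97 with |H| = 4 and |K| = 8 (all parameters illustrative).
p, nH, nK = 97, 4, 8
wH = pow(5, (p - 1) // nH, p)            # generator of H (order 4)
wK = pow(5, (p - 1) // nK, p)            # generator of K (order 8)
H = [pow(wH, k, p) for k in range(nH)]
K = [pow(wK, k, p) for k in range(nK)]

def v(x, n):                             # vanishing polynomial X^n - 1 of a subgroup
    return (pow(x, n, p) - 1) % p

def ev(c, x):                            # Horner evaluation of a coefficient list
    acc = 0
    for coeff in reversed(c):
        acc = (acc * x + coeff) % p
    return acc

def interp(vals, w, n):                  # naive inverse DFT over the subgroup <w>
    inv_n = pow(n, -1, p)
    return [inv_n * sum(vals[k] * pow(w, -j * k, p) for k in range(n)) % p
            for j in range(n)]

def pmul(a, b):                          # schoolbook polynomial multiplication mod p
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out

def divmod_vK(c, n):                     # divide by X^n - 1: X^i = X^{i-n}(X^n - 1) + X^{i-n}
    c, q = c[:], [0] * max(len(c) - n, 1)
    for i in range(len(c) - 1, n - 1, -1):
        q[i - n] = (q[i - n] + c[i]) % p
        c[i - n] = (c[i - n] + c[i]) % p
        c[i] = 0
    return q, c[:n]

# Toy indexer tables on K (row/col map into H, val arbitrary) and toy challenges.
row = [H[i % nH] for i in range(nK)]
col = [H[(3 * i) % nH] for i in range(nK)]
val = [(2 * i + 1) % p for i in range(nK)]
b1, b2 = 5, 7                            # beta_1, beta_2, sampled outside H

# f_3 on K per Eq. (5), its sum sigma_3, and its coefficients (degree < |K|).
f3_vals = [v(b2, nH) * v(b1, nH) * val[k]
           * pow((b2 - row[k]) * (b1 - col[k]), -1, p) % p for k in range(nK)]
sigma3 = sum(f3_vals) % p
f3 = interp(f3_vals, wK, nK)

# Since deg f_3 < |K|, the sum over K is |K| times the constant term, so
# f_3(X) = X*g_3(X) + sigma_3/|K| holds with g_3 the shifted coefficients.
assert f3[0] == sigma3 * pow(nK, -1, p) % p
g3 = f3[1:]

# Prover's h_3: v_K must divide the combined identity's right-hand side.
row_p, col_p, val_p = (interp(t, wK, nK) for t in (row, col, val))
lin1 = [(b2 - row_p[0]) % p] + [(-c) % p for c in row_p[1:]]   # beta_2 - row^(X)
lin2 = [(b1 - col_p[0]) % p] + [(-c) % p for c in col_p[1:]]   # beta_1 - col^(X)
s = v(b2, nH) * v(b1, nH) % p
prod = pmul(pmul(lin1, lin2), f3)
R = [((s * val_p[i] if i < nK else 0) - prod[i]) % p for i in range(len(prod))]
h3, rem = divmod_vK(R, nK)
assert all(x == 0 for x in rem)          # the division is exact

# Verifier's O(log|K|)-style spot check of the combined identity at beta_3.
b3 = 11
lhs = (s * ev(val_p, b3)
       - (b2 - ev(row_p, b3)) * (b1 - ev(col_p, b3))
       * (b3 * ev(g3, b3) + sigma3 * pow(nK, -1, p))) % p
rhs = ev(h3, b3) * v(b3, nK) % p
assert lhs == rhs
```

The two final assertions mirror the protocol: the prover's polynomials satisfy the combined identity exactly (\(v_{{K}}\) divides the right-hand side), and the verifier's single spot check at \(\beta _3\) passes.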

If this third test passes then \(\mathbf {V}\) can use the value \(\sigma _3\) in place of \(\hat{{M}}(\beta _2, \beta _1)\) to finish the second test. If that test also passes, \(\mathbf {V}\) can in turn use the value \(\sigma _2\) in place of \({r}_{{M}}(\alpha ,\beta _1)\) to finish the first test.