Table of contents
- 1. The sum-check protocol and the Binius commitment scheme
- 2. Sum-check, lookups, performance, and simplicity
- 3. Elliptic curves, again
- 4. Precompiles and zkVM benchmarks
We’ve been making steady progress on Lasso and Jolt, our new, simple, and performant lookup argument and zkVM: we released the theory papers and the Lasso implementation last summer, and a fully open source implementation of Jolt last month.
The implementation has shown Jolt’s promise relative to existing technology and challenged much of the conventional wisdom in SNARK design. Since its release, we have added support for the Rust standard library, incorporated improvements from over 10 contributors and merged nearly 50 pull requests, and improved the modularity and extensibility of the codebase.
While we continue to enhance Jolt, I wanted to address questions, clarify misconceptions, and share my perspective on key issues. The four areas I explore here are (1) the relationship between the sum-check protocol and the Binius commitment scheme, (2) the role of sum-check and lookups in Jolt, (3) elliptic curves versus hashing, and (4) precompiles as they relate to zkVMs.
1. The sum-check protocol and the Binius commitment scheme

Commitment schemes are often highlighted as the key building block of a SNARK. But it’s also important to remember the vital role of the other component: the polynomial interactive oracle proof (PIOP). For instance, the Binius commitment scheme for multilinear polynomials represents a major advance, but it must be paired with a PIOP to prove that the committed data actually validates the prover’s claims.
Binius’ commitment is specifically compatible with PIOPs that use the sum-check protocol. This is due to reasons both clear (sum-check relies on multilinear polynomials, not univariate ones; and FRI-Binius even uses sum-check internally) and nuanced (sum-check PIOPs naturally operate across fields of any characteristic, which is crucial for taking full advantage of Binius’ novel performance traits). Binius’ commitment is not compatible with the PIOPs that are most common today, which unfortunately don’t use sum-check.
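For readers who have not seen it, here is a toy, self-contained sketch of the sum-check protocol for a multilinear polynomial given by its evaluations over the Boolean hypercube. The 61-bit field, the hard-coded challenges, and the final step where the verifier reads off g’s evaluation directly (rather than querying a polynomial commitment) are simplifications for illustration, not how Jolt or Binius implement the protocol.

```rust
// Toy sum-check: prove that the sum of a multilinear polynomial g over
// {0,1}^n equals a claimed value, using n rounds of one linear polynomial
// each. Field, randomness, and the final oracle query are simplified.

const P: u64 = 2305843009213693951; // 2^61 - 1 (a toy prime field)

fn add(a: u64, b: u64) -> u64 { ((a as u128 + b as u128) % P as u128) as u64 }
fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }
fn one_minus(r: u64) -> u64 { add(1, P - r) } // 1 - r mod P

/// Prover's round message: for multilinear g the round polynomial is linear,
/// so its evaluations at X = 0 and X = 1 determine it. `evals` is g with all
/// earlier variables already bound to the verifier's challenges.
fn round_message(evals: &[u64]) -> (u64, u64) {
    let half = evals.len() / 2;
    let g0 = evals[..half].iter().fold(0, |acc, &v| add(acc, v)); // x_i = 0 half
    let g1 = evals[half..].iter().fold(0, |acc, &v| add(acc, v)); // x_i = 1 half
    (g0, g1)
}

/// Bind the current variable to challenge r: g(r, x') = (1-r)*g(0,x') + r*g(1,x').
fn bind(evals: &[u64], r: u64) -> Vec<u64> {
    let half = evals.len() / 2;
    (0..half)
        .map(|j| add(mul(one_minus(r), evals[j]), mul(r, evals[half + j])))
        .collect()
}

fn main() {
    // g over {0,1}^3, listed with the first variable as the high-order bit.
    let mut evals: Vec<u64> = vec![3, 1, 4, 1, 5, 9, 2, 6];
    let claimed_sum = evals.iter().fold(0, |acc, &v| add(acc, v));

    // In the real protocol these come from the verifier (or Fiat-Shamir).
    let challenges = [7u64, 11, 13];

    let mut expected = claimed_sum;
    for &r in &challenges {
        let (g0, g1) = round_message(&evals);              // prover -> verifier
        assert_eq!(add(g0, g1), expected);                 // verifier's round check
        expected = add(mul(one_minus(r), g0), mul(r, g1)); // round polynomial at r
        evals = bind(&evals, r);                           // prover binds x_i := r
    }

    // Final check against g(r_1, r_2, r_3). In a SNARK this value would come
    // from the polynomial commitment; here we just read it off the bound table.
    assert_eq!(evals[0], expected);
    println!("sum-check accepted; sum = {}", claimed_sum);
}
```

Each round is a single pass over the (halved) evaluation table, which is the source of the prover efficiency the rest of this post refers to.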
Designing a fast PIOP requires more insight than just the phrase “apply sum-check”. Binius builds on over a decade of work on using the sum-check protocol to obtain prover-efficient polynomial IOPs. Sections 4 and 5 of the Binius paper are devoted to designing new, highly efficient sum-check-based PIOPs to combine with the commitment scheme.
The Binius commitment and Jolt go together like peanut butter and jelly because Jolt is the only zkVM today that is exclusively based on the sum-check protocol. At the moment, Jolt uses commitment schemes based on elliptic-curve cryptography, but incorporating the Binius commitment into Jolt is our highest priority.
2. Sum-check, lookups, performance, and simplicity

What sets Jolt apart? Is it that it is the first (and currently only) zkVM that exclusively uses sum-check-based polynomial IOPs, or is it that Jolt implements the lookup singularity (i.e., it does almost everything with lookups rather than constraint systems or circuits)? The answer is: it’s both. Most of Jolt’s simplicity benefits over prior zkVMs come from lookups; its performance benefits come from both lookups and sum-check.
The lookup-only approach is better for some instructions (those that don’t have very small circuits) but can be worse for others that do have very small circuits, such as bitwise XOR when working over a field of characteristic two, where XOR is simply native field addition. Overall, though, the lookup-only approach is a net positive for performance, at least when working over 256-bit fields. Today, the Jolt prover spends about 20% of its time on these “instruction execution” lookups and 40% of its time proving constraints, so adding more constraints to reduce lookups would not help.
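To make the XOR example concrete: in a binary field GF(2^k), an element is a k-bit vector (the coefficients of a polynomial over GF(2)), and addition is coefficient-wise mod 2, which is exactly bitwise XOR. A minimal sketch:

```rust
// Addition in a binary field GF(2^k) is coefficient-wise mod 2, i.e. XOR,
// so proving a bitwise-XOR instruction over such a field is just proving
// one native field addition. (Multiplication in GF(2^k) is not this simple.)
fn gf2k_add(a: u64, b: u64) -> u64 {
    a ^ b
}

fn main() {
    let (a, b) = (0b1011_0110u64, 0b0110_1100u64);
    assert_eq!(gf2k_add(a, b), 0b1101_1010);
    assert_eq!(gf2k_add(a, a), 0); // characteristic two: x + x = 0
}
```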
Roughly speaking, Jolt uses lookups to implement both the “fetch” and “execute” parts of the CPU’s fetch-decode-execute loop. These lookups are fast enough that most of the prover’s time is spent proving it ran “decode,” which is handled through traditional constraints.
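As a greatly simplified illustration of “execute via lookups” (this shows the decomposition idea behind Lasso, not Jolt’s actual subtable layout or proof machinery): 32-bit AND needs no hand-written circuit if each operand is split into 8-bit chunks, each chunk pair is looked up in one 2^16-entry subtable, and the results are recombined.

```rust
// Simplified "execute via lookups": handle 32-bit AND with no circuit by
// splitting each operand into 8-bit chunks, looking each chunk pair up in a
// 2^16-entry subtable, and recombining. Chunk sizes, subtable structure, and
// how the lookups are *proved* are much richer in Lasso/Jolt; this only shows
// the decomposition that replaces a per-instruction circuit.

fn build_and_subtable() -> Vec<u8> {
    // Index = (a_chunk << 8) | b_chunk; entry = a_chunk & b_chunk.
    (0..(1u32 << 16)).map(|idx| ((idx >> 8) as u8) & (idx as u8)).collect()
}

fn and_via_lookups(a: u32, b: u32, table: &[u8]) -> u32 {
    let mut result = 0u32;
    for chunk in 0..4 {
        let shift = 8 * chunk;
        let a_c = (a >> shift) & 0xff;
        let b_c = (b >> shift) & 0xff;
        let entry = table[((a_c << 8) | b_c) as usize] as u32; // one small-table lookup
        result |= entry << shift;                              // recombine the chunks
    }
    result
}

fn main() {
    let table = build_and_subtable();
    let (a, b) = (0xDEAD_BEEFu32, 0x0F0F_F0F0u32);
    assert_eq!(and_via_lookups(a, b, &table), a & b);
}
```

Lasso proves such lookups without materializing the subtable row by row, exploiting its structure; the sketch only shows why one small table suffices for a 32-bit instruction.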
The lookup-only approach also results in a simpler and more auditable implementation. These benefits are difficult to quantify and take time to recognize and appreciate. But Jolt does well on crude proxies such as lines of code (the Jolt codebase is about 25,000 lines, roughly 2x to 4x fewer than prior RISC-V zkVMs) and development time. Such improvements are much harder to obtain than performance ones: while I expect zkVM provers to be nearly 100x faster in the coming months than they were in August 2023, it’s hard to imagine a zkVM with even 10x fewer lines of code.
3. Elliptic curves, again

Public discourse undervalues the benefits of having a fast zkVM that works over elliptic curves, partly due to the justified enthusiasm for hashing-based commitment schemes like Binius.
When proving statements about elliptic curve cryptography, a curve-based zkVM avoids the non-native field arithmetic that adds hundred-to-thousand-fold overheads to proving time. Such applications include proving knowledge of many digital signatures (the main workload for blockchain light clients and SNARK-based bridges); aggregating Plonk/Groth16/Nova/Honk proofs; and proving knowledge of Verkle tree authentication paths.
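To see where those overheads come from, consider proving a single 256-bit multiplication inside a SNARK whose native field is a different size. The witness has to carry a limb decomposition, so one multiplication becomes many limb products plus carries, and in a SNARK every limb and carry also needs a range check; a SNARK over a matching curve handles the same multiplication as roughly one native operation. A rough sketch of just the limb arithmetic, with hypothetical 64-bit limbs:

```rust
// Schoolbook multiplication of two 256-bit integers as 4 x 64-bit limbs.
// In non-native SNARK arithmetic, each of the 16 limb products below becomes
// constraint material, and every limb and carry must also be range-checked;
// modular reduction adds still more. (Limb size here is illustrative only.)

fn mul_256(a: [u64; 4], b: [u64; 4]) -> [u64; 8] {
    let mut out = [0u64; 8];
    for i in 0..4 {
        let mut carry: u128 = 0;
        for j in 0..4 {
            let t = out[i + j] as u128 + (a[i] as u128) * (b[j] as u128) + carry;
            out[i + j] = t as u64;  // low 64 bits stay in this limb
            carry = t >> 64;        // high bits ripple into the next limb
        }
        out[i + 4] = carry as u64;  // this slot is untouched so far
    }
    out
}

fn main() {
    // Sanity check against native 128-bit arithmetic on small inputs.
    let (x, y) = (0xDEAD_BEEF_CAFE_F00Du64, 0x1234_5678_9ABC_DEF0u64);
    let out = mul_256([x, 0, 0, 0], [y, 0, 0, 0]);
    let expect = (x as u128) * (y as u128);
    assert_eq!(out[0], expect as u64);
    assert_eq!(out[1], (expect >> 64) as u64);
}
```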
I am optimistic that the community will converge around sum-check–based PIOPs combined with the FRI-Binius commitment scheme as the right approach to performant SNARKs in many applications. Even if this happens, there will still be a role for fast curve-based SNARKs, unless the world moves away from elliptic curve cryptography entirely (e.g., after society completes a shift away from cryptosystems that are not quantum-secure).
To summarize: hashing-based commitments like Binius are a major advance and will likely be the right choice for many applications, but curve-based SNARKs remain the fastest way to prove statements about elliptic curve cryptography, so both will have a role to play for the foreseeable future.
4. Precompiles and zkVM benchmarks

There has been some discussion about precompiles and their role in zkVMs and in benchmarking. Before weighing in, it’s helpful to unpack what precompiles are, as the term’s meaning varies by context.
“Precompiles” in Ethereum. In the Ethereum Virtual Machine (EVM), precompiles are operations that are performed frequently and are natively supported to enhance efficiency. This avoids the substantial overheads and exorbitant gas costs associated with executing these operations through lengthy sequences of EVM opcodes.
The distinction between “EVM precompiles” and “primitive instructions” (opcodes) is primarily semantic. For example, the Keccak hash function is an EVM opcode, while SHA-2 is an EVM precompile. Both precompiles and opcodes are frequently executed operations that Ethereum natively supports for identical purposes: optimizing efficiency and gas costs. Precompiles are undeniably part of the EVM, which is commonly used to describe the Ethereum execution environment broadly, encompassing much more than just opcodes.
Why does the EVM even have precompiles if they function essentially the same as opcodes? It’s mainly just a matter of convention. One other possible reason is that precompiles consist of relatively complicated operations like cryptographic primitives that may need to change in the future, and it will be easier to change them if they are not assigned an opcode.
“Precompiles” in zkVM design. In zkVM design, a precompile refers to a special-purpose SNARK targeted at a specific functionality, such as Keccak or SHA hashing or a particular elliptic curve group operation. Today, SNARK precompiles are typically implemented via hand-optimized constraint systems (although as the community switches to sum-check-based SNARKs, the nature of these constraint systems and how they get proved will change).
The parallels between EVM precompiles and zkVM precompiles run deep. Before Jolt, zkVMs implemented primitive instructions via hand-optimized constraint systems, one per instruction, exactly as they implement precompiles. The difference between what gets called a zkVM precompile and what gets called a primitive instruction is purely semantic; there is no actual difference between them.
In Jolt, we implement primitive instructions with lookups instead of traditional constraints. But there are no major issues with choosing to implement some primitive instructions via constraints. (In fact, lookups can even be viewed as a type of constraint.) Indeed, as I’ve said previously, we will likely have to use traditional constraints to implement RISC-V addition and multiplication once we switch to the Binius commitment scheme over fields of characteristic two.
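To make that last point concrete, here is a rough illustration (my own, not Jolt’s actual constraint system) of why integer addition maps naturally onto constraints over a characteristic-two field: a 32-bit ADD is a ripple-carry adder, and each full adder is two low-degree equations over GF(2), namely s_i = a_i + b_i + c_i and c_{i+1} = a_i·b_i + c_i·(a_i + b_i), where + is field addition (XOR). A constraint system would commit to the bit and carry columns and enforce these equations; the sketch below just evaluates them and checks the result against native addition.

```rust
// Integer ADD over a characteristic-2 field: a ripple-carry adder, where each
// full adder is two low-degree GF(2) equations (+ is XOR, * is AND):
//   s_i     = a_i + b_i + c_i
//   c_{i+1} = a_i*b_i + c_i*(a_i + b_i)
// A real constraint system would commit to the bit/carry columns and enforce
// these; here we only evaluate them and compare against native addition.

fn bit(x: u32, i: usize) -> u8 { ((x >> i) & 1) as u8 }

fn add_via_gf2_equations(a: u32, b: u32) -> u32 {
    let (mut sum, mut carry) = (0u32, 0u8);
    for i in 0..32 {
        let (ai, bi) = (bit(a, i), bit(b, i));
        let s = ai ^ bi ^ carry;                      // s_i
        let c_next = (ai & bi) ^ (carry & (ai ^ bi)); // c_{i+1}
        sum |= (s as u32) << i;
        carry = c_next;
    }
    sum // dropping the final carry matches wrapping 32-bit addition
}

fn main() {
    let (a, b) = (0xDEAD_BEEFu32, 0xC0FF_EE11u32);
    assert_eq!(add_via_gf2_equations(a, b), a.wrapping_add(b));
}
```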
With that background, here are my views on precompiles as they relate to zkVMs and benchmarking.
First, benchmarking various RISC-V zkVMs without precompiles is exactly what it means to benchmark RISC-V zkVMs. The term “zkVM” is an informal one, so there’s room for disagreement, but in my view a RISC-V zkVM with one or more precompiles is no longer a zkVM for RISC-V: it is a zkVM for the new instruction set obtained from RISC-V by adding each precompile as a primitive instruction. At a minimum, each precompile added to a zkVM erodes the value proposition of the zkVM paradigm: every additional circuit adds surface area for bugs, and existing programs won’t be able to leverage the new precompile out of the box.
Some also conflate the notion of EVM precompiles for zkEVMs with the notion of precompiles for zkVMs. But these are very different things. While a few key operations for zkEVMs — such as Merkle hashing and digital signature verification — are indeed more complicated than primitive RISC-V instructions, that doesn’t change the fact that there is no functional difference between EVM precompiles and primitive EVM instructions. zkEVMs must support EVM precompiles to claim parity with the EVM. In other words, a zkEVM that does not support EVM precompiles is not analogous to a RISC-V zkVM like Jolt, where precompiles would be used to expand the instruction set beyond RISC-V.
Another concern is how to choose a “fair” set of functionalities on which to benchmark the zkVMs. But for RISC-V zkVMs, any set of functionalities is fair. The prover time is almost entirely determined simply by the number of cycles that the RISC-V CPU runs for, for two reasons. First, the prover spends a small fraction of its time on the “execute” part of the “fetch-decode-execute” loop. Second, different RISC-V instructions, as well as memory accesses, all have reasonably similar proving times. (In Jolt, they are all handled with offline memory checking techniques.)
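For readers unfamiliar with offline memory checking, here is a toy sketch of the underlying idea, loosely following the classic Blum et al. technique: every operation contributes (address, value, timestamp) tuples to a read multiset and a write multiset, and memory behaved honestly exactly when the two multisets match once the initial and final states are accounted for. The toy field, the particular fingerprint, and the trusted timestamps below are stand-ins for what a real prover (Jolt included) commits to and proves with grand-product or sum-check arguments.

```rust
// Toy offline memory checking: each read/write logs (addr, value, timestamp)
// tuples into a read multiset RS and a write multiset WS. If, after adding
// the initial state to WS and the final state to RS, the two multisets are
// equal, every read returned the last value written. Multiset equality is
// checked with a random-fingerprint product over a toy prime field.

const P: u128 = 2305843009213693951; // 2^61 - 1 (toy field)

fn fadd(a: u128, b: u128) -> u128 { (a % P + b % P) % P }
fn fmul(a: u128, b: u128) -> u128 { (a % P) * (b % P) % P }

#[derive(Clone, Copy)]
struct Tuple { addr: u64, val: u64, ts: u64 }

/// Fingerprint of a tuple at verifier randomness (r, gamma):
/// (addr + r*val + r^2*ts) - gamma, all in the toy field.
fn fingerprint(t: Tuple, r: u128, gamma: u128) -> u128 {
    let packed = fadd(t.addr as u128, fadd(fmul(r, t.val as u128), fmul(fmul(r, r), t.ts as u128)));
    fadd(packed, P - gamma % P)
}

struct CheckedMemory {
    cells: Vec<(u64, u64)>, // (current value, timestamp of last write) per address
    reads: Vec<Tuple>,      // RS
    writes: Vec<Tuple>,     // WS (seeded with the initial state)
    now: u64,
}

impl CheckedMemory {
    fn new(size: usize) -> Self {
        let writes = (0..size as u64).map(|a| Tuple { addr: a, val: 0, ts: 0 }).collect();
        CheckedMemory { cells: vec![(0, 0); size], reads: Vec::new(), writes, now: 0 }
    }

    fn read(&mut self, addr: usize) -> u64 {
        self.now += 1;
        let (val, ts) = self.cells[addr];
        self.reads.push(Tuple { addr: addr as u64, val, ts });
        self.writes.push(Tuple { addr: addr as u64, val, ts: self.now }); // re-write same value
        self.cells[addr] = (val, self.now);
        val
    }

    fn write(&mut self, addr: usize, val: u64) {
        self.now += 1;
        let (old_val, old_ts) = self.cells[addr];
        self.reads.push(Tuple { addr: addr as u64, val: old_val, ts: old_ts });
        self.writes.push(Tuple { addr: addr as u64, val, ts: self.now });
        self.cells[addr] = (val, self.now);
    }

    /// The check a SNARK prover would actually prove (here just computed).
    fn verify(mut self, r: u128, gamma: u128) -> bool {
        for (addr, &(val, ts)) in self.cells.iter().enumerate() {
            self.reads.push(Tuple { addr: addr as u64, val, ts }); // final state joins RS
        }
        let product = |set: &[Tuple]| set.iter().fold(1u128, |acc, &t| fmul(acc, fingerprint(t, r, gamma)));
        product(self.reads.as_slice()) == product(self.writes.as_slice())
    }
}

fn main() {
    let mut mem = CheckedMemory::new(4);
    mem.write(1, 42);
    assert_eq!(mem.read(1), 42);
    mem.write(3, 7);
    assert_eq!(mem.read(3), 7);
    // r and gamma would be random verifier challenges (or Fiat-Shamir).
    assert!(mem.verify(123_456_789, 987_654_321));
    println!("offline memory check passed");
}
```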
Finally, if precompiles are brought into the picture, Jolt will likely not do any worse than alternatives. Indeed, I expect it to do better because sum-check-based precompiles will be the fastest, and can be integrated into Jolt without overhead due to its exclusive use of sum-check-based PIOPs. On this point, some people expressed concern that precompiles using elliptic-curve commitment schemes will be much worse than those using hashing-based schemes. Jolt uses curves today, but this is not essential, and we’ve been very open about our plans to switch over to Binius.
Our main goal in benchmarking is to ascertain the intrinsic performance profiles of different proof systems, to the extent that they can be isolated from their implementations. This approach allows the community to understand and converge upon the right techniques for designing performant and secure SNARKs. But when attempting to compare two different SNARKs, there are endless confounding factors that often make an “apples to apples” comparison all but impossible.
Engineering effort represents one of these confounding factors, although much of the community appears to take the opposite view. The thinking seems to go something like this: if a project has added “features” like precompiles or optimized implementations for specific hardware, then it should get “credit” for them in any benchmark.
Both viewpoints have merit. But taken too far, the latter is clearly untenable: new approaches would forever be at a disadvantage in any benchmark simply because those implementing them have not had as much engineering time as older projects. This holds back progress.
Over time, I expect the confounding factors in benchmarking to ease. As tooling to build SNARKs matures, SNARKs will require less engineering effort to get decent performance. And it’s a minor miracle that — at least for RISC-V — zkVM costs depend mainly on cycle counts rather than idiosyncrasies of any particular application. If people converge on a choice of constraint system (as opposed to today’s fractured state of R1CS, AIR, Plonkish, etc.), something similar may happen for SNARKs targeted at constraint systems, with simple measures of constraint-system size standing in for cycle counts.
Until then, it will be difficult to strike the right balance between under- and over-controlling for confounding factors. Disagreements will be inevitable, and builders will have to provide the full context, details, and rationales behind any benchmarks so the community can understand and discuss them.
***
Justin Thaler is Research Partner at a16z and an Associate Professor in the Department of Computer Science at Georgetown University. His research interests include verifiable computing, complexity theory, and algorithms for massive data sets.