Boosting Lasso+Jolt through faster commitments – with far-reaching consequences

Justin Thaler

Earlier this year, we introduced Lasso and Jolt, which promise to yield SNARKs that are not only much more performant but also easier to build and audit.

Lasso is a new lookup argument with a dramatically faster prover. Jolt builds on Lasso to offer a fundamentally new paradigm for designing so-called zkVMs – SNARKs enabling a prover to prove that it correctly ran a computer program (specified in the assembly language of some specific virtual machine). 

Both Lasso and Jolt have at their core a technical tool called the sum-check protocol, which they use to minimize the amount of data (and the “size” of each piece of data) that the prover has to commit to. The cost of committing to data is a key bottleneck for SNARK prover performance. 

Today, Ben Diamond and Jim Posen of Ulvetanna (henceforth, D&P) have released a paper that changes the game further. D&P’s central contribution is a modification of the Ligero/Brakedown commitment scheme that allows the prover to roughly “pay by the bit” for each committed value. I’ll explain how they achieve this, and why, in combination with Lasso, it leads to dramatically faster SNARKs. D&P call their implementation Binius.

These developments not only enable better performance. They also show that we’ve been incorrectly conceptualizing key aspects of SNARK design. For example, Jolt already showed that when we hand-design “SNARK-friendly” VMs, we’ve actually been tailoring them to the artificial limitations of today’s popular SNARKs. D&P show that the same goes for SNARK-friendly hashes. At the end of this post, I’ll also unpack what needs to change about SNARK design, both in terms of how we think about it and what we build. 

(I have posted two separate and more modest notes that complement D&P’s contributions. I also wrote a more technical companion FAQ that answers questions I’ve encountered when discussing these works.)

Better SNARKs for Circuit-SAT via Lasso

Jolt can be viewed as an application of Lasso to VM execution, which by definition consists of executing the simple fetch-decode-execute cycle of the VM over and over again. However, Lasso itself is much more generally applicable. 

The first thing that D&P do is use Lasso to give a better SNARK for circuit satisfiability (more accurately, the “circuits” supported are Plonkish constraint systems). This can be viewed as applying Lasso to (potentially) non-uniform computations (unlike VM execution, which is inherently uniform, in that a VM repeatedly executes the same fetch-decode-execute cycle). This application exploits that Lasso is actually something more powerful than a lookup argument: It’s an argument for general sparse linear relations, meaning for establishing that M ⋅ t = a, where M is a sparse matrix (i.e., most of its entries are zero) and t and a are (possibly committed) vectors. 
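
To make the sparse-relation view concrete, here is a toy Python sketch (the table, indices, and sizes are all illustrative, not Lasso's actual encoding) showing that checking a batch of lookups into a table t is exactly checking M ⋅ t = a for a matrix M with a single nonzero entry per row.

```python
# Toy illustration of "lookups as a sparse linear relation" (table, indices, and
# sizes are illustrative, not Lasso's actual encoding).
table = [x**3 % 97 for x in range(16)]          # t: the lookup table, fixed in advance
lookups = [3, 7, 3, 12]                          # indices the prover claims to look up
a = [table[i] for i in lookups]                  # a: the claimed lookup results

# M has one row per lookup and one column per table entry, with exactly one
# nonzero entry per row -- so it has only len(lookups) nonzeros, no matter how
# large the table is.
M = [[1 if col == idx else 0 for col in range(len(table))] for idx in lookups]

# Verifying the lookups is exactly verifying the sparse relation M . t = a.
assert all(sum(M[r][c] * table[c] for c in range(len(table))) == a[r]
           for r in range(len(lookups)))
```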

As with Jolt, this application of Lasso has the important property that almost all of the values the prover has to cryptographically commit to are small, meaning they are between 0 and 2^b for some number b that is not very large (we refer to b as the bit-complexity of each committed value). For example, in Jolt, b ≤ 25 for most committed values. 

In a separate note, Srinath Setty and I present an alternative SNARK for (a generalization of) Plonkish that we feel is conceptually simpler but may be less easy to integrate with D&P’s subsequent contributions.

These new Lasso-based SNARKs for Plonkish constraint systems will make it easier to integrate the techniques underlying Lasso into existing tooling (like Halo2), much of which specifically supports Plonkish. 

A faster polynomial commitment scheme for small values

First, some background. As explained in my previous posts, most SNARKs consist of two components: A so-called polynomial IOP, and a cryptographic primitive called a polynomial commitment scheme. The key prover bottleneck is typically the polynomial commitment scheme. 

Two broad classes of polynomial commitment schemes are popular today. One class is based on elliptic curve cryptography, with the most popular examples being KZG commitments and Bulletproofs/IPA. The other is based on hashing, with the most popular example currently being FRI.

For both curve-based and hashing-based commitment schemes, it is faster to commit to small values than big values. But the cost for the prover as a function of the size of the values is a step function: As the size of the values grows, the cost mostly stays flat, with occasional dramatic jumps. 

In the curve-based case, this is due to Pippenger’s algorithm, which roughly allows the prover to commit to 1-to-20-bit values with a single group operation, 20-to-40-bit values with two group operations, 40-to-60-bit values with three group operations, and so forth. (Here, 20 is a stand-in for “the logarithm of the length of the committed vector,” which is often between 20 and 30, and the aforementioned commitment costs are amortized.) With hashing-based commitment schemes, many projects today work over extension fields – large fields that have a much smaller base field inside of them. Commitments are faster to compute if all committed values reside in the base field. Popular base fields today are between 31 and 64 bits.

Nonetheless, the extent to which today’s commitment schemes benefit from small values is limited. In the curve-based case, committing to a 2-bit value (i.e., a value in {1, 2, 3, 4}) is no faster than committing to a 20-bit value (i.e., a number between 0 and 2^20 ≈ 1 million): Both require roughly one group operation via Pippenger’s algorithm. Something similar holds for today’s popular hashing-based schemes, which treat all base field elements more or less the same. (See item 14 in my accompanying technical FAQ for details.)
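
For intuition, here is a rough back-of-the-envelope cost model of that step function (a sketch under the simplifying assumption that each window of roughly log2(n) bits costs about one amortized group operation per committed value; it is not an exact operation count).

```python
# Rough cost model for the "step function" in curve-based commitments: with
# Pippenger's bucket method, each window of ~log2(n) bits costs roughly one
# amortized group operation per committed value.
import math

def amortized_group_ops(bits, n):
    window = max(1, round(math.log2(n)))   # window size ~ log of vector length
    return math.ceil(bits / window)        # ~one group op per window, amortized

n = 2**20  # a million committed values
for bits in (1, 2, 20, 25, 40, 64):
    print(f"{bits:>2}-bit values: ~{amortized_group_ops(bits, n)} group op(s) each")
# 1-, 2-, and 20-bit values all cost about one group operation; the cost only
# jumps at (roughly) 20-bit boundaries, so very small values see no extra benefit.
```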

D&P smooth out this step function for the Ligero/Brakedown hashing-based polynomial commitment scheme, allowing the prover to roughly “pay by the bit” for each committed value. They accomplish this through a combination of two techniques. First, they work over the field GF[2^128], which has characteristic two (a substantial departure from today’s popular SNARKs, which have characteristic at least 2^31). They specifically construct this field as a tower, meaning that GF[2^128] is constructed as an extension of GF[2^64], which is constructed as an extension of GF[2^32], which is constructed as an extension of GF[2^16], and so on. Second, they use techniques related to code concatenation to ensure that committing to a 1-bit value is really about 2x cheaper than committing to a 2-bit value, which is about 2x cheaper than committing to a 4-bit value, and so forth. (See my more technical companion FAQ for a sketch of the techniques.)
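
For readers curious what "constructing the field as a tower" looks like concretely, below is a minimal, unoptimized Python sketch of multiplication in a Wiedemann-style binary tower (the kind of recursive construction described above; the code is my own illustration, not D&P's implementation). The level-k field GF[2^(2^k)] is a quadratic extension of the level-(k-1) field, and elements are packed into 2^k-bit integers.

```python
# Wiedemann-style binary tower: GF[2^(2^k)] = GF[2^(2^(k-1))][X_k] / (X_k^2 + X_{k-1}*X_k + 1),
# with X_0 = 1. A level-k element is a 2^k-bit integer whose high/low halves are
# level-(k-1) elements. Illustrative Python, not an optimized implementation.

def tower_mul(a, b, k):
    """Multiply a, b in the level-k tower field GF[2^(2^k)]."""
    if k == 0:
        return a & b                       # GF(2): multiplication is AND
    half = 1 << (k - 1)                    # number of bits in each half
    mask = (1 << half) - 1
    a0, a1 = a & mask, a >> half           # a = a0 + a1*X_k
    b0, b1 = b & mask, b >> half
    a0b0 = tower_mul(a0, b0, k - 1)
    a1b1 = tower_mul(a1, b1, k - 1)
    # Karatsuba: a0*b1 + a1*b0 = (a0+a1)(b0+b1) - a0*b0 - a1*b1 (addition is XOR)
    mid = tower_mul(a0 ^ a1, b0 ^ b1, k - 1) ^ a0b0 ^ a1b1
    # Reduce using X_k^2 = X_{k-1}*X_k + 1:
    x_prev = 1 if k == 1 else 1 << (1 << (k - 2))   # X_{k-1} as a level-(k-1) element
    lo = a0b0 ^ a1b1
    hi = mid ^ tower_mul(a1b1, x_prev, k - 1)
    return (hi << half) | lo

def tower_pow(x, e, k):
    acc = 1
    while e:
        if e & 1:
            acc = tower_mul(acc, x, k)
        x = tower_mul(x, x, k)
        e >>= 1
    return acc

# Sanity check: every nonzero element of the level-3 field GF[2^8] satisfies x^255 = 1.
assert all(tower_pow(x, 255, 3) == 1 for x in range(1, 256))
```

In this packed representation, an element of a small subfield such as GF[2^16] occupies only the low-order bits of the 128-bit encoding, which (together with the code-concatenation techniques) is what lets the prover roughly pay by the bit.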

The effect can be substantial. In D&P’s current SNARK for Keccak, all of the committed values are individual bits, and their improvements speed up the time to commit to these one-bit values by over an order of magnitude. As another example, in Jolt, between one-third and two-thirds of committed values are just a single bit. (See Figure 4 of the Jolt paper for a breakdown of the size of committed values.) 

Dramatically better SNARKs for hashing

These first two contributions yield a dramatically faster SNARK for circuit-satisfiability – likely one or even two orders of magnitude faster than what one could build from prior work – when applied to circuits over fields of characteristic two. (See my technical companion FAQ for more on the technical benefits of working over characteristic-two fields.) 

D&P then apply this SNARK to a circuit implementing the hash function Keccak (and another one called Grøstl), yielding much faster SNARKs for these hash functions than anything that has come before. Once fully optimized, I expect their SNARKs to prove over 5,000 Keccak-f permutations per second single-threaded (on the machine used in their benchmarks, which is an AWS c7i.16xlarge instance with an Intel Sapphire Rapids 8488C processor), with ample parallelism available. This means 20-25 MiB of data can be hashed with about 30 seconds of single-threaded processing. 
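
For the curious, the 20-25 MiB figure follows from simple throughput arithmetic, assuming the Keccak-256 parameterization used in Ethereum, in which each invocation of Keccak-f[1600] absorbs a 1088-bit (136-byte) block. This is my own back-of-the-envelope check, not a benchmark.

```python
# Back-of-the-envelope check of the 20-25 MiB figure (assumes the Keccak-256
# parameterization: capacity 512, so each Keccak-f[1600] call absorbs a
# 1088-bit = 136-byte block). Arithmetic only, not a benchmark.
perms_per_second = 5_000     # the single-threaded estimate quoted above ("over 5,000")
rate_bytes = 136             # bytes absorbed per permutation
seconds = 30
print(perms_per_second * rate_bytes * seconds / 2**20, "MiB")   # ~19.5 MiB at exactly 5,000/s
```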

Consequences 

Better SNARKs for hashing are powerful in and of themselves because many cryptographic statements to which SNARKs are applied (such as knowledge of Merkle-authentication paths) boil down to hashing. Indeed, this is the key bottleneck in Type-1 zkEVMs as well as in recursing hashing-based SNARKs. 

Accordingly, while Lasso enables the better SNARKs for hashing, it (as well as Jolt) also benefits from them: The better SNARKs for hashing enable Lasso and Jolt to be combined with (recursive versions of) the Ligero/Brakedown hashing-based polynomial commitment schemes. These schemes have an extremely fast prover but require the verifier to do a somewhat large number of hash operations (square root in the size of the committed polynomial). This has made recursive application of SNARKs using Ligero/Brakedown difficult because it has been expensive for the prover to prove that it correctly performed the verifier’s hash operations. This difficulty should disappear given much faster SNARKs for hashing. 
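
To see where the verifier's hashing and the square-root proof size come from, here is a heavily simplified Python sketch of the Ligero/Brakedown commitment structure. Everything here is illustrative: a toy prime field, a naive Reed-Solomon row code, a flat hash list standing in for a Merkle tree, and far too few queried columns for real security.

```python
# Heavily simplified sketch of the Ligero/Brakedown "square-root" commitment.
import hashlib, random

P = 2**61 - 1       # toy prime modulus (illustrative; Binius instead uses binary tower fields)
RATE = 2            # code blowup factor
NUM_QUERIES = 4     # number of spot-checked columns (illustrative only)

def rs_encode(row, K):
    """Naive Reed-Solomon: evaluate the polynomial with coefficients `row` at 1..K."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(row)) % P for x in range(1, K + 1)]

def commit(values, k):
    """Arrange `values` into rows of length k, encode each row, hash each column."""
    assert len(values) % k == 0
    rows = [values[i:i + k] for i in range(0, len(values), k)]
    encoded = [rs_encode(r, RATE * k) for r in rows]
    cols = [[row[j] for row in encoded] for j in range(RATE * k)]
    col_hashes = [hashlib.sha256(str(c).encode()).hexdigest() for c in cols]
    root = hashlib.sha256("".join(col_hashes).encode()).hexdigest()  # stand-in for a Merkle root
    return rows, encoded, col_hashes, root

def open_and_check(rows, encoded, col_hashes, k):
    """Verifier checks a random combination of rows against a few spot-checked columns."""
    m, K = len(rows), RATE * k
    r = [random.randrange(P) for _ in range(m)]                       # verifier's random combination
    combined = [sum(r[i] * rows[i][j] for i in range(m)) % P
                for j in range(k)]                                    # sent by prover: ~sqrt(N) field elements
    combined_enc = rs_encode(combined, K)                             # verifier re-encodes it
    for j in random.sample(range(K), NUM_QUERIES):
        col = [encoded[i][j] for i in range(m)]                       # real scheme: column + Merkle path
        assert hashlib.sha256(str(col).encode()).hexdigest() == col_hashes[j]  # one hash per queried column
        assert sum(r[i] * col[i] for i in range(m)) % P == combined_enc[j]     # linearity of the code
    return True

rows, encoded, col_hashes, root = commit(list(range(64)), k=8)
assert open_and_check(rows, encoded, col_hashes, k=8)
```

With N committed values arranged into a roughly sqrt(N)-by-sqrt(N) matrix, the opening consists of one combined row plus a handful of columns, each of length about sqrt(N), and the verifier hashes one column per query; this hashing is exactly the cost that fast SNARKs for hashing make cheap to recurse.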

Concretely, for circuits with several billion gates, Ligero/Brakedown proofs are on the order of MBs, with the verifier V mainly performing field operations (which are cheap for recursion) and hashing. Given SNARKs that can hash 20 MBs of data with 30 seconds of single-threaded processing (and ample parallelizability), it should take the prover less than 10 seconds of single-threaded processing time to apply recursion to this proof system. 

On top of that, when using recursion, it doesn’t make sense to optimize Ligero/Brakedown for proof size, but rather for the runtime of the recursive prover. Since hash operations, compared to field operations, are much more expensive to prove, one should configure Ligero/Brakedown to have V do less hashing, even though that means bigger proofs and more field operations. This will speed up recursive proving beyond the aforementioned estimate of 10 seconds. 

In summary, combining Lasso and Jolt with (recursive application of) D&P’s commitment scheme will further improve Lasso’s and Jolt’s performance. This is both because Keccak and Grøstl are faster hash functions than today’s popular SNARK-friendly ones, and because, regardless of the hash function used, Ligero/Brakedown has a faster prover than FRI (see #7 in my technical FAQ). 

Revising our view of SNARK-friendly hashes 

Jolt showed that most zkVM projects have been wrong about what they consider to be “SNARK-friendly” VMs. The key property that makes an instruction “SNARK-friendly” is something called decomposability. Real instruction sets, not hand-designed for supposed SNARK-friendliness, naturally satisfy this property. We’ve been hand-designing VMs to be friendly to the limitations of today’s popular SNARKs, but these limitations are artificial. D&P show that the same goes for SNARK-friendly hashes. (Fully optimizing D&P’s SNARKs for standard hash functions, and comparing their performance to those that are popular today when applied to ostensibly SNARK-friendly hashes, is near-term future work). 
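
To illustrate decomposability, here is a toy Python example (the chunk size and the choice of instruction are just for illustration, not Jolt's actual tables): the evaluation table of 64-bit bitwise AND has 2^128 entries, but it never needs to be materialized, because the instruction acts independently on small chunks of its inputs, and each chunk can be handled by a lookup into a much smaller subtable.

```python
# Toy illustration of decomposability. The 64-bit AND instruction's evaluation
# table has 2^128 entries, but AND acts independently on 16-bit chunks of its
# operands, so it reduces to a few lookups into a much smaller subtable.
CHUNK = 16
MASK = (1 << CHUNK) - 1

def and_subtable(x, y):
    """The 2^32-entry subtable for 16-bit AND -- also never materialized explicitly."""
    return x & y

def and_via_chunked_lookups(a, b):
    out = 0
    for i in range(64 // CHUNK):
        ai = (a >> (CHUNK * i)) & MASK
        bi = (b >> (CHUNK * i)) & MASK
        out |= and_subtable(ai, bi) << (CHUNK * i)   # one small-table lookup per chunk
    return out

a, b = 0xDEADBEEFCAFEBABE, 0x0123456789ABCDEF
assert and_via_chunked_lookups(a, b) == (a & b)
```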

Faster sum-check prover over small fields

D&P’s commitment costs are now so low that in their current implementation of their SNARK for Keccak, the prover bottleneck is no longer the polynomial commitment scheme, but rather the sum-check protocol (the technical core of the polynomial IOPs underlying Lasso and Jolt). 

However, existing sum-check prover implementations, including D&P’s, have not been optimized to take advantage of the fact that the values being summed are all “small.” That is, existing sum-check-prover implementations perform few field operations, but most of those operations happen in the “whole field” GF[2^128], even when most of the values being summed reside in a much smaller subfield, like GF[2], GF[2^8], or GF[2^16].

In my separate note, I have optimized the sum-check prover to take advantage of small values. Depending on just how small the values are, this can improve the sum-check prover’s work by anywhere from a factor of about 2 to multiple orders of magnitude. 

Incorporating these optimizations into D&P’s implementations is near-term future work. 

The role of interaction in SNARK design

The SNARKs that today are widely characterized as having state-of-the-art prover performance use minimally interactive (i.e., constant-round) polynomial IOPs, combined with highly interactive polynomial commitment schemes, namely FRI. (Bulletproofs/IPA is also highly interactive, although not as often associated with a fast prover.) This is the reverse of the way fast-prover SNARKs should be designed. Interaction in the polynomial IOP can be exploited to reduce prover costs, while interaction in the polynomial commitment scheme is exploited to reduce verifier costs, often at the expense of prover costs. 

Since prover costs are the key scalability bottleneck for SNARKs, interaction in the polynomial IOP is the key to more scalable SNARKs, while large amounts of interaction in the polynomial commitment scheme can actually hurt scalability. 

In more detail, FRI (and Bulletproofs/IPA) exploit interaction to minimize proof size, but this comes at the expense of prover time (see #7 in my technical FAQ). Instead, we should aim for the fastest possible prover, even if this means obtaining only slightly sublinear size proofs. We can then apply recursion to bring the proof size down. This is exactly what is done in the Ligero/Brakedown polynomial commitment, which involves just a single round of interaction, and has a very fast prover (but square-root-sized proofs). Before D&P’s work, it was hard to recurse SNARKs using these commitment schemes, as the Ligero/Brakedown verifier has to perform so many hash evaluations. In fact, some works like Orion simply “leave the verifier’s hashes out of the recursion,” limiting the reduction in proof size and verifier work. But with much faster SNARKs for proving hash evaluations, this issue disappears. 

Multiple rounds of interaction within the polynomial IOP do dramatically help the prover. Specifically, the sum-check protocol exploits many rounds of interaction to minimize commitment costs for the prover. 

At a lower level, the sum-check protocol uses multivariate polynomials and many rounds of interaction, while today’s most popular polynomial IOPs use univariate polynomials and few rounds of interaction. The univariate approach achieves the same functionality as the multivariate one, but at the crucial expense of requiring the prover to commit to large quantities of extra data.

What’s going on here is that the sum-check protocol has the prover do extra field operations (which are fast) to reduce the number of cryptographic operations (which are slow) that the prover performs within the commitment scheme. This is a win for prover time. In contrast, highly interactive commitment schemes have the prover do extra cryptographic operations (relative to minimally interactive ones), not to reduce the prover’s work but rather to reduce verifier costs like proof size and verifier work. It’s better to use recursion rather than interaction to push these verifier costs onto the prover, with only a low-order increase in prover time (now that we have fast enough SNARKs for hashing).
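
For readers who haven't seen it, here is a minimal Python sketch of the sum-check protocol for a single multilinear polynomial, with the prover and verifier interleaved in one function for brevity (the toy prime modulus and the in-the-clear final check are illustrative simplifications).

```python
# Minimal sum-check sketch for one multilinear polynomial g, given by its 2^n
# evaluations on the Boolean hypercube. Toy prime modulus; prover and verifier
# are interleaved, and the final "commitment opening" is done in the clear.
import random

P = 2**61 - 1  # toy prime modulus

def sumcheck(evals):
    n = len(evals).bit_length() - 1
    assert len(evals) == 1 << n
    table = [e % P for e in evals]
    claim = sum(table) % P                       # prover's claimed sum over {0,1}^n
    for _ in range(n):
        half = len(table) // 2
        g0 = sum(table[:half]) % P               # round polynomial at 0 (first variable = 0)
        g1 = sum(table[half:]) % P               # round polynomial at 1 (first variable = 1)
        assert (g0 + g1) % P == claim            # verifier's consistency check
        r = random.randrange(P)                  # verifier's random challenge
        # Prover folds the table, fixing the first variable to r (multilinear interpolation).
        table = [(table[i] + r * (table[i + half] - table[i])) % P for i in range(half)]
        claim = (g0 + r * (g1 - g0)) % P         # g_i(r) becomes next round's claim
    # Final check: claim must equal g(r_1, ..., r_n); in a real SNARK this value
    # comes from an opening of the polynomial commitment rather than from the table.
    assert claim == table[0]
    return claim

sumcheck([3, 1, 4, 1, 5, 9, 2, 6])   # runs n = 3 rounds and passes all checks
```

Each of the n rounds costs the prover only field additions and multiplications over the current table; the only committed data is the polynomial itself, which is why the many rounds of interaction are a net win for the prover.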

Synopsis. We should exploit interaction within the polynomial IOP to minimize the amount of data the prover has to commit to. The polynomial commitment scheme used to commit that data should be as fast as possible for the prover, subject to having sublinear verification costs. Just one round of interaction suffices for this. The verification costs can then be reduced through recursion, without introducing new prover bottlenecks.

Summary of new developments

  • D&P use Lasso to give a better SNARK for circuit-SAT. This SNARK can use any commitment scheme for multilinear polynomials.  
  • Separately, they give a faster polynomial commitment scheme (Ligero/Brakedown instantiated over fields of characteristic two). Roughly speaking, by using binary tower fields and code concatenation, they ensure that committing to very small values is extremely fast (at least an order of magnitude faster than prior work). This scheme is very fast for the prover but has somewhat large verification costs (the verifier does a lot of hashing). 
  • These two developments combine to give extremely fast SNARKs for standard hash functions like Keccak (again, with somewhat large verification costs).
  • The faster SNARKs for hashing enable recursion on these SNARKs, despite their somewhat large verification costs. 
  • Recursion, in turn, solves the problem of large verification costs.

Implications for the SNARK ecosystem

These new works mean that, to obtain SNARKs with performant provers, we should change essentially every component of today’s deployments, including:

  • Polynomial IOPs. They should be sum-check-based to minimize the amount of data the prover needs to commit to. 
  • Polynomial commitment schemes. We should combine faster-prover, bigger-proof schemes like Ligero/Brakedown with recursion. Ligero/Brakedown has precisely the same security properties as FRI (they are transparent, plausibly post-quantum secure, based only on hashing, etc.).
  • Hash functions. I’d advocate for using hash functions like Keccak and Grøstl, which can be proved at least as quickly as today’s supposed “SNARK-friendly” ones. If we do want to cook up hash functions with the goal of making them even friendlier to SNARKs, we’ll have to start from scratch in light of our improved understanding of the actual power and limitations of performant SNARKs. 
  • Instruction sets for zkVMs. We should use standard instruction sets such as RISC-V rather than ones designed around the limitations of previous proving systems. And we shouldn’t hand-design circuits implementing each instruction. Rather, zkVM designers should simply specify the evaluation table of each instruction and use a sum-check-based lookup argument like Lasso.
  • The fields they work over. For a variety of technical reasons, today’s popular SNARKs require fields of relatively large characteristic (deployments typically use characteristic at least 2^31). SNARKs based on the sum-check protocol do not have this limitation, and D&P show how to exploit fields of very small characteristic like GF[2^128] for major gains in performance. 

Fortunately, the same developments that necessitate these changes also lead to SNARKs that are simpler and easier to build (although there is still room for further improvement). In particular, Jolt eliminates the need to hand-design instruction sets for zkVMs or to hand-optimize circuits implementing those instruction sets because it replaces those circuits with a simple specification of the evaluation table of each primitive instruction. This modular and generic architecture makes it easier to swap out fields and polynomial commitment schemes and implement recursion, and generally reduces the surface area for bugs and the amount of code that needs to be maintained and audited. 

This simplicity is essential not only for usability and speed of development. It helps address a major security issue. SNARK-based systems consisting of many tens of thousands of lines of code, that require understanding multiple custom constraint systems or DSLs, will never be sufficiently auditable to secure billions of dollars of value. 

***

I hope this post convinces more projects to invest in developing SNARKs that follow this design paradigm. There is a lot of building to do. 

In the immediate future, some of the claims I am making still need to be fully verified via implementation (e.g., comparing D&P’s SNARKs for Keccak to those for ostensibly SNARK-friendly hashes, and fully implementing recursion to bring the proof size down). Meanwhile, our preliminary Jolt implementation (with curve-based commitment schemes) is nearly complete. Once finished, it will be worth re-implementing Jolt to use D&P’s hashing-based commitment schemes. This is somewhat involved, mainly because switching from a large prime field to a field of characteristic two necessitates redefining all the lookup tables to which Lasso is applied. I also hope that the new Lasso-based SNARKs for Plonkish circuits ease the integration of Lasso into existing tooling, as they allow the circuits that people have already written to be fed directly into Lasso. 

These are obvious next steps. I’m excited to see what happens once the community fully absorbs the power of the sum-check protocol to minimize commitment costs. 

***

Justin Thaler is Research Partner at a16z and an Associate Professor in the Department of Computer Science at Georgetown University. His research interests include verifiable computing, complexity theory, and algorithms for massive data sets.
