Getting the bugs out of SNARKs: The road ahead

11.20.24

In the blockchain world, SNARKs hold the promise of unparalleled scalability and enhanced privacy compared to today’s transparent ledgers. Yet the current reality is that SNARK implementations are riddled with bugs, providing security through obscurity rather than robust cryptographic guarantees. Achieving true security requires formal verification — proof that SNARK systems are bug-free. This post briefly reviews progress on Jolt, our zkVM for RISC-V, and then discusses our efforts toward formal verification.

zkVMs and Jolt: A short recap

zkVMs are SNARKs that allow an untrusted prover to prove that it correctly ran some specified computer program on a witness, where the program is specified as bytecode conforming to a particular instruction set architecture (ISA) like RISC-V.

zkVMs offer the most promising path to usable and maintainable SNARKs. They allow developers to write programs in whatever high-level language they please, so long as the language comes with a compiler down to the ISA’s bytecode. Developers need not express computations directly as circuits or constraints, or even within “zkDSLs.” Those building the zkVMs will deal with the nastiness of constraint systems and circuits once and for all, meaning that anyone can then use the zkVM without any knowledge of how the proof machinery actually works.

In April 2024, we released an initial implementation of the Jolt zkVM for RISC-V and showed its performance compares favorably to alternatives. Our initial findings have since been confirmed by several other benchmarks. Since then, we have dramatically reduced Jolt’s proof size (from several MBs to 200 KBs, and as low as 35 KBs at the cost of slightly increased prover time) and incorporated additional features like support for the Rust standard library and the RISC-V Multiplication extension.

Soon, I expect Jolt to be the only RISC-V zkVM capable of proving in resource-constrained environments (under 2GB of prover space, compared to 8-16 GBs for RISC Zero and more for SP1), while also achieving state-of-the art prover time.

But this post is not about Jolt’s performance or feature support. It’s about a more important topic: correctness and security. This post lays out our roadmap toward the long-term goal of obtaining a formally verified Jolt implementation and describes the initial steps we have already taken along this path.

A key risk for SNARKs

Bugs and catastrophes

The slightest bug in a SNARK toolchain can lead to catastrophic security failures. For example, a single missing or improperly specified constraint could mean that the prover does not actually know the witness that it claims it knows (e.g., does not actually know the secret key controlling the Ethereum wallet initiating a purported blockchain transaction).

Until we have confidence that our toolchains are completely bug free, projects using SNARKs cannot really be secured by the SNARK itself. At best, they are secured by obscurity: finding the bugs may require expertise (to identify missing or malspecified constraints, or any other bug in the tool-chain used to deploy the SNARK). But the bugs are there — and that likely means a motivated adversary can find convincing proofs of false statements.

The very worst-case situation is that we think our SNARK toolchains are bug-free, and therefore stop treating SNARKs as but one of many layers of a system of defense-in-depth. These other layers of security will be semi-centralized. In the context of rollups, this may mean using an allowlist of who can submit proofs, the use of TEEs, or a security council that can “undo” acceptance of buggy proofs, likely at the substantial cost and chaos of also “undoing” many transactions that network participants believed to have been already confirmed.

The long-term prospects are bright, but far off

Over the long term, zkVMs will have implications not just for SNARK usability, but also for correctness. If we develop confidence that the zkVM itself is bug free, and ensure that the program bytecode on which the zkVM is applied correctly implements the intended witness-checking procedure, then the end-to-end toolchain will also be bug free.

The Ethereum Foundation is investing heavily to bring this endgame to reality — but it is years away. In fact, I consider it a distinct possibility that in five years’ time, we still don’t have strong confidence that any performative zkVM toolchain is actually bug free.

Today, zkVM toolchains are almost certainly full of bugs. We should stop pretending they’re not. To give a sense of how complicated they are, consider the following:

The Jolt codebase is up to 34,000 lines of Rust code (if you don’t count the SDK or the RISC-V emulator, it’s “only” about 25,000 lines), plus it uses some functionality from arkworks, namely field, group, and pairing operations. SP1 is over 60,000 lines of Rust code (closer to 100,000 if you include the proof system it invokes, plonky3), plus thousands of lines of Go code, and several hundred lines of assembly code. RISC Zero’s codebase is even bigger, with over 100,000 lines of Rust code, almost 200,000 lines of C++, 150,000 lines of CUDA, 400 lines of assembly, and more. (All of these statistics were obtained using the cloc –vcs git command to count lines of code.)

Of course, most of these lines of code are devoted to implementing the zkVM prover, yet for security guarantees only the zkVM verifier needs to be correctly implemented, and the verifier can be much simpler than the prover. Still, these statistics give a sense of the complexity of these systems, and even the verifier implementations are many thousands of lines of code.

Given this state of affairs, simplicity is a major virtue in any zkVM, and we firmly believe that Jolt is the simplest zkVM for RISC-V today. But formal methods are not some magic wand that can be waved at any piece of software and magically make all the bugs go away. Major technical challenges will have to be overcome to get any kind of guarantee that zkVM toolchains are end-to-end correct and secure. Simplicity in zkVMs is not only important in the short term; it will ease these formal verification efforts and remain a virtue for many years to come.

Roadmap to a bug-free Jolt

Near-term efforts (complete and in-progress)

In the intermediate term, there is no substitute for audits and bug-finding efforts. Jolt has not yet undergone a full audit, but it has been subjected to considerable scrutiny. Recently, the zkSecurity team found several bugs in Jolt. These bugs were not in the “guts” of the proof machinery but rather in how inputs and outputs are processed, how Jolt makes sure the prover actually ran the computer program to completion rather than “stopping it early,” and in allowing the prover to specify certain parameters that should instead be specified by the verifier or another honest party. (See zkSecurity’s companion post for additional details on the bugs and how they found them.)

Beyond bug-finding efforts, we have taken the first steps along the (very) long road to a formally verified Jolt implementation. First, one major reason for Jolt’s simplicity is its lookup-centric design: Rather than expressing each primitive instruction of the VM as a circuit or constraint system, Jolt instead “decomposes” each instruction into a handful of lookups into small lookup tables and then uses our Lasso lookup argument to verifiably perform the lookups. With Carl Kwan and Quang Dao, who are leading Jolt’s formal verification efforts, we have used the ACL2 automated theorem prover to verify the correctness of Jolt’s lookup decompositions. This formalization is highly automated, which means the techniques will be easy to extend to different, richer ISAs than RISC-V, should the community prove interested in them down the line. These verification efforts have also led to modest performance improvements to Jolt, by identifying, in an automated fashion, some unnecessary lookups we were doing for a handful of primitive RISC-V instructions.

Second, Quang and collaborators Gregor Mitscha-Baude from zkSecurity and Devon Tuma are using Lean to verify the soundness of the Spartan, Lasso, and Spice polynomial IOPs (PIOPs) at the heart of Jolt. This work will extend broadly to all PIOPs used in practice, not only those used in Jolt. This work will likely be complete by the middle of 2025. As with all of Jolt, it’s being done in the open, and you can follow along with Quang’s progress at his ZKlib repo. This effort is one of the projects recently funded by the Ethereum Foundation’s formal verification project.

A third step is to verify (assuming the soundness of the polynomial IOPs above) that the constraint system to which Spartan is applied in Jolt, combined with the lookups and reads/writes to memory that Jolt applies Lasso and Spice to, together capture the RISC-V ISA. This effort consists of first building a formal RISC-V model in ACL2 (which is done), and then using ACL2 to verify the equivalence of the model to Jolt’s combination of constraints, lookups, and memory operations (which is in progress).

Longer-term efforts

Converging on a theorem prover

We plan to port over all of our ACL2 formalizations to Lean. Lean is newer and less mature than ACL2 but has more community support/libraries for mathematics, and more momentum within the SNARK community (e.g., a large part of the Ethereum Foundations’s zkEVM verification effort is expected to be in Lean). Most of our initial efforts have used ACL2 in large part to take advantage of its extensive support of automated techniques like bit-blasting (e.g., bitvector reasoning via passing to a SAT solver) and rewriting. Lean recently added support for rudimentary bit-blasting, so long-term it may make sense to shift the entire effort to Lean.

Verifying the actual Jolt implementation

A major long-term challenge is to actually verify the Rust implementation of the Jolt verifier (much less an on-chain implementation of the same, via Solidity or EVM bytecode). We haven’t really begun to make headway here. Jolt is really just a polynomial IOP for proving correct execution of RISC-V programs, and our efforts to date have mainly focused on making sure the polynomial IOP is secure, not that the Rust implementation of Jolt correctly implements the polynomial IOP. The main exception to date: We have, via fuzzing, validated that the lookup decompositions we’ve formalized in ACL2 match the decompositions implemented in Jolt’s Rust codebase.

Here, we will need good tooling to ensure the correctness of Rust code (according to certain specifications). There are many potential approaches for enabling this including tools such as Restricted Algorithmic Rust (RAR), Aeneas, Verus , and others. Another possible approach is to circumvent the issue of Rust programming altogether. Many theorem provers (e.g., ACL2, Lean, Coq) support code extraction to certain target languages (e.g., OCaml, C, Lisp), and Rust may be supported in the future. A major issue with code extraction is that often the “cross-compilation” step is itself unverified. One unique advantage of ACL2 is that execution in the theorem prover is the same as native Common Lisp execution, so a “formally-verified Jolt verifier” in ACL2 can be called as needed, requiring no cross-compilation to a target language. Overall, I would not be surprised if we only seriously target this essential direction 1.5 years or more from now. A silver lining of this timeline is that Rust tooling will mature considerably in the meantime.

In the meantime, as we verify the correctness of various components of Jolt’s polynomial IOP and constraint systems using Lean/ACL2, we’ll extensively test for compliance of the verified models with Jolt’s Rust code. This is the approach that we took in verifying Jolt’s lookup decompositions. Though it falls well short of the end goal of a “formally verified Rust verifier,” it does let us gain confidence in the correctness of different parts of the Jolt codebase.

A DSL for all “constraints” used in Jolt

Unlike other zkVMs, which mainly use so-called Plonkish or AIR constraint systems to capture VM execution, Jolt uses a combination of rank-one constraints (R1CS) and “offline memory checking” techniques via Lasso and Spice that check that the prover is correctly performing reads into read-only memory (aka lookups) and read-write memory (aka registers and RAM), respectively.

Currently, the Jolt implementation specifies the above constraints/lookups/memory operations in ad hoc fashion. For R1CS specification, we were initially using Circom, but due to untenable overheads, we hand-rolled our own R1CS specification language and vertically integrated it into Jolt. Likewise, lookups and reads and writes of registers and RAM are specified via internal APIs.

For maintainability and continued verification as Jolt evolves, it would be good to have a single DSL that cleanly captures R1CS, lookups, and registers/RAM. This will also be useful for pre-compile specification. (Like Jolt itself, pre-compiles in Jolt will consist of a mix of R1CS, lookups, and arithmetic circuits to which the GKR protocol can be applied.) We are working with Galois to take steps in this direction.

Formal verification and controlling prover memory

A challenge for formal verification efforts is the heavy use of SNARK recursion in today’s zkVMs. To keep the prover’s space bounded (i.e., avoid needing terabytes of space to prove even toy program executions) and to reduce proof size, the standard procedure today is to break VM executions into “shards,” prove correct execution of each shard, and then recursively aggregate the proofs. This requires turning a SNARK verifier into a circuit or constraint system, and proving that one knows a satisfying assignment for the circuit. Any error in the specification of the circuit can lead to catastrophic security failures. And from a usability perspective, recursion simply makes it very hard for the public to tell what statement is actually being verified. No longer is the prover simply proving, “I ran this computer program correctly.” Rather, it is proving, “I know many proofs, one per shard, that each shard was run correctly, and the state of the machine at the end of shard i matches the state at the start of shard i+1”.

At least for sum-check-based SNARKs like Jolt, it is actually known how to keep the prover’s space bounded without SNARK recursion. We’ll flesh out the details in the context of Jolt in an upcoming e-print, and have begun implementing such a non-recursive space-bounded prover. It will take major engineering effort to achieve a performant implementation with this approach, but this will eventually pay off, leading to a simpler and easier-to-verify implementation.

Folding

Long term, I expect that “non-recursive space control” for SNARK provers will be an effective way to achieve a simple and performative prover when using hashing-based commitment schemes like Ligero, Brakedown, or FRI-Binius. But for the elliptic-curve-based commitment schemes supported by Jolt today, folding is the right way to minimize prover space while keeping the prover fast. Since folding inherently involves recursion, this will require formally verifying the correctness of a circuit representation of the Jolt verifier. This task may be complicated but will be helped by new tooling from the zkEVM formal verification project.

(I don’t expect that folding will be difficult to incorporate into Jolt, but it might be challenging to apply formal methods to confirm its correctness due to its application of recursive proving to a complicated verification procedure. Fortunately, lots of tools are being developed to help with this task.)

Verifying commitment schemes and Fiat-Shamir

Succinct arguments typically consist of two components: a PIOP and a polynomial commitment scheme. Together, this yields a succinct interactive argument, which is then rendered non-interactive via the Fiat-Shamir transformation. Efforts described above to formally verify the soundness of the Jolt PIOP are not enough on their own to guarantee that the Jolt SNARK is secure. We will also have to verify that the polynomial commitment schemes supported by Jolt are secure and correctly implemented, and also formally verify that applying Fiat-Shamir to the resulting succinct argument is secure and correctly implemented. This is a future goal of Quang and his collaborators’ work in Zklib, that is, to verify the soundness not only of PIOPs but also of polynomial commitment schemes and the PIOP-to-SNARK transformation.

Other Directions

Verified bytecode

Some projects have criticized RISC-V zkVMs by discussing bugs in compilers from high-level languages down to RISC-V bytecode, and leveling criticisms at Rust as a high-level language. Our take is if one wants verified RISC-V bytecode then one should verify the bytecode directly, or use a verified compiler. And if one doesn’t like Rust as a high-level language, then one is free to use any other high-level language that compiles down to RISC-V.

To be sure, additional tooling to produce verified RISC-V bytecode would be of massive value to zkVMs. But that is outside of the scope of the zkVM itself. The zkVM’s job is to make sure the bytecode it is handed is run correctly.

A verified implementation of the EVM via Jolt with precompiles

RISC-V zkVMs are a promising tool for building zkEVMs (EVM=Ethereum Virtual Machine). Jolt is well suited for this task, but for maximum performance this will entail incorporating a handful of pre-compiles into Jolt (and formally verifying the implementation indeed captures EVM semantics). This is on the Jolt roadmap, and the formal verification component will be able to take advantage of other efforts supported by EF’s zkEVM Formal Verification project.

***

I have serious concerns about the rush to deploy zkVMs while they still almost certainly contain bugs. The road to bug-free zkVMs is long, and extreme caution is warranted while we walk it. But we will get there eventually, and the tools and techniques developed along the way will pay rich dividends in computer science and mathematics well beyond SNARK design.

***

Justin Thaler is Research Partner at a16z and an Associate Professor in the Department of Computer Science at Georgetown University. His research interests include verifiable computing, complexity theory, and algorithms for massive data sets.

***

The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the current or enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.

This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investment-list/.

The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures/ for additional important information.