SNARK Security and Performance

9.13.22

Quantitative security measures
And qualitative ones
Polynomial commitments
Concrete verifier costs and security in deployed SNARKs
Verifier costs
Concrete security
How many hash evaluations are feasible today?
Proposed actions for a healthy ecosystem
An aside: Diving even deeper into PQ-SNARK security
A proposal for mitigating the tension between security and performance

While you’re reading hover Bold⁽¹⁾ text for more info.

Quantitative security measures
And qualitative ones
Polynomial commitments
Concrete verifier costs and security in deployed SNARKs
Verifier costs
Concrete security
How many hash evaluations are feasible today?
Proposed actions for a healthy ecosystem
An aside: Diving even deeper into PQ-SNARK security
A proposal for mitigating the tension between security and performance

While you’re reading hover Bold⁽¹⁾ text for more info.

A SNARK (Succinct Non-interactive Argument of Knowledge) is a cryptographic tool that opens up exciting new possibilities in system design, especially in decentralized settings. SNARKs allow data to be processed by untrusted entities, who can then prove that the data is valid and has been processed correctly. A naive way to prove this would be to publish the data, allowing anyone who wishes to check its validity and process it directly.

A SNARK achieves the same effect, but with better costs to the validators. For example, in a layer-two (L2) verifiable rollup, a rollup service processes blockchain transactions. The service periodically posts its users’ account balances to the layer-one blockchain. Every time it posts an update to the balances, it appends a SNARK proof that it knows a sequence of valid transactions that changed the old account balances to the updated ones. In this way, blockchain validators do not have to do the hard work of checking transaction validity (e.g., checking a digital signature for each transaction) nor explicitly process the transactions to compute the updated balances.

My previous post focused on the performance of SNARK provers – the primary determinant of SNARK applicability. While running a SNARK prover can be computationally intensive, to the extent that it may be infeasible to generate a proof for large-scale computations, SNARK verification is typically far cheaper than directly checking and processing data. However, SNARK verification costs do vary considerably. This post unpacks these costs and compares the security properties of different SNARKs.

In particular, I explain that in practical SNARKs with plausible post-quantum security (PQ-SNARKs), there is a significant tension between security and verification costs. Moreover, in some cases, this tension is currently being resolved in favor of verification costs over security.For SNARKs to realize their potential, deployed systems must be secure and users must be confident in their security. I conclude the post by proposing simple actions that the web3 community can take to help ensure these properties hold for the long term.

Quantitative security measures

A SNARK is secure if it is computationally infeasible to produce a convincing proof of a false statement. For example, in the context of L2 rollups, an attacker wishing to steal my funds would want to prove a false statement of the form: “I know a digitally signed transaction that transfers all of Justin’s assets to my own account.”

The security level of a SNARK is measured by the amount of work that must be done to find a convincing proof of a false statement. Similar to other cryptographic primitives like digital signatures, the logarithm of this amount of work is referred to as the “bits of security.” For example, 30 bits of security implies that 2³⁰ ≈ 1 billion “steps of work” are required to attack the SNARK. This is inherently an approximate measure of real-world security because the notion of one step of work can vary, and practical considerations like memory requirements or opportunities for parallelism are not considered.

And qualitative ones

Bits of security is a quantitative measure of the security of a SNARK. SNARKs also differ in their qualitative security properties.

For example, some SNARKs require a trusted setup ceremony to generate a structured proving key. SNARKs that don’t require a trusted setup at all are called transparent.

For a non-transparent SNARK to be secure, typically at least one participant in the ceremony has to have behaved honestly, meaning they produced and then discarded a “trapdoor” that, if combined with the trapdoors of all other participants, would make it easy to find convincing SNARK “proofs” of any false statement. Trusted setups are being run with hundreds or thousands of participants to render this assumption as mild as possible.

SNARKs also differ in their susceptibility to quantum attacks. Specifically, many SNARKs (e.g., Groth16, PlonK, Marlin, Bulletproofs, Nova) rely on the assumption that discrete logarithms are difficult to compute (and in some cases, additional assumptions as well). Computing discrete logarithms is something that quantum computers would be able to do efficiently. Hence, these SNARKs are not post-quantum secure (non-PQ).

While urgent efforts are underway to switch to post-quantum encryption schemes, this is primarily driven by the need to keep encrypted messages secret many decades into the future. An adversary that stores an intercepted message today and waits for a quantum computer to arrive in, say, fifty years, can use the computer to decrypt the fifty-year-old message. The situation with SNARKs is quite different. The use of non-PQ SNARKs today should not compromise the security of blockchains fifty years in the future, even if quantum computers do arrive by that time.

Polynomial commitments

All SNARKs make use of a cryptographic primitive known as a polynomial commitment scheme, and this component is often a performance bottleneck. (For details, see my previous post.) In addition, a SNARK’s transparency and plausible post-quantum security are determined solely by the polynomial commitment scheme it uses.

One prominent example is so-called KZG polynomial commitments, which are used by PlonK, Marlin, and many others. KZG commitments are neither transparent nor post-quantum secure. Other commitment schemes are transparent but not post-quantum, including Bulletproofs, Hyrax, and Dory.

Still other schemes are both transparent and plausibly post-quantum. These include FRI, Ligero, Ligero++, Brakedown, and Orion. All of these schemes are based on error-correcting codes. Hashing is the only cryptographic primitive they use. They also share the following property: verification costs (measured by the number of hash evaluations and field operations) grow linearly with the number of bits of security.Roughly speaking, a single “iteration” of these post-quantum polynomial commitments achieves only a small number (say 2-4) bits of security. So the protocol must be repeated many times to “amplify” the security level, with verification costs growing with each repetition. Thus, to control verification costs in PQ-SNARKs (and hence gas costs in blockchain applications) protocol designers are incentivized to keep the security level low.

With rare exceptions, this tension between concrete security and verification costs does not arise in non-PQ SNARKs (transparent and non-transparent alike). Non-PQ-SNARKs use elliptic curve groups in which discrete logarithms are presumed to be hard to compute, and their security levels are determined by the group used. The size of the appropriate elliptic curve group, and hence the cost of each group operation, grow with the desired security level. However, the number of group elements in a proof are independent of the choice of group.

In PQ-SNARKs, not only does the size of hash evaluations grow linearly with the desired security level, so does the number of evaluations included in the proof and performed by the verifier.

Concrete verifier costs and security in deployed SNARKs

The concrete verifier costs and security levels of deployed SNARKs vary considerably. Here are some illustrative examples:

Verifier costs

Myprevious post mentioned two examples of concrete verification costs:PlonK proofs costunder 300,000 gas to verify on Ethereum, while StarkWare’s proofs cost about 5 million gas. StarkWare’s proofs are transparent and plausibly post-quantum while PlonK’s are not. However, as detailed next, StarkWare’s proofs are being run at a much lower security level than PlonK proofs on Ethereum.

Concrete security

The only elliptic curve with Ethereum precompiles is called altbn128, so this is the curve all non-PQ SNARKs deployed on Ethereum use, including Aztec’s and zkSync’s. This curve — and hence also the deployed SNARKs that use it — was originally believed to offer roughly 128 bits of security.

But due to improved attacks (the precise runtime of which is difficult to determine), the curve is now widely regarded to offer 100 to 110 bits of security. There are EIPs under consideration to introduce precompiles for different curves that are still believed to offer close to 128 bits of security. Such curves are already used in the SNARKs of non-Ethereum projects including ZCash, Algorand, Dfinity, Filecoin, and Aleo.

In contrast, according to on-chain data, StarkWare’s PQ-SNARKs (in its so-called SHARP system, short for SHARed Prover) have been configured to target 80 bits of security. This security level holds under certain conjectures about the statistical soundness of FRI (which I’ll address later in this post).

The term “80 bits of security” is vague in this context, so let me unpack it. Roughly speaking, it means that an attacker can produce a convincing proof of a false statement by evaluating a hash function 2⁸⁰ times (namely KECCAK-256, the hash function used by Ethereum). To be more precise, an attacker that is willing to perform 2^k hash evaluations can produce a convincing proof with a probability of roughly 2^k-80. For example, with 2⁷⁰ hash evaluations, one can find a SNARK “proof” of a false statement with a probability of about 2^-10 = 1/1024.

This notion is weaker than what the term “80 bits of security” means in other contexts. For example, a collision-resistant hash function (CRHFs) configured to 80 bits of security would produce 160-bit outputs. If the hash function is well-designed, the fastest collision-finding procedure will be the Birthday attack, a brute-force procedure that can find a collision with about 2^160/2 = 2⁸⁰ hash evaluations. However, with 2^k hash evaluations, the Birthday attack will find a collision with a probability of only 2^2k-160.

For example, 2⁷⁰ hash evaluations will yield a collision with a probability of 2^-20 ≈ 1/1,000,000. This is far smaller than the 1/1,000 probability of an attacker forging a PQ-SNARK proof configured to 80 bits of security.

This difference can drastically alter the attractiveness of performing an attack. To illustrate, imagine an attacker has a budget of $100K, which can buy 2⁷⁰ hash evaluations. And suppose that should the attacker succeed, the payoff is $200 million. The expected value of the attack against a PQ-SNARK configured to 80 bits of security is (1/1,000) * $200 million, or $200K – twice the cost. If the success probability were only 1/1,000,000, as with a CRHF, the expected value of the attack would be just $200.

The two notions of “80 bits of security” also differ dramatically in their robustness to independent attacks. For example, suppose 1,000 independent parties each attack the PQ-SNARK by performing 2⁷⁰ hash evaluations. Since each succeeds with a probability of about 1/1,000, at least one of them is quite likely to succeed. This would not be the case if each succeeded with a probability of only 1/1,000,000.

Well-designed elliptic curve groups are expected to behave similarly to CRHFs, with birthday-like attacks such as Pollard’s rho algorithm being the best known. This means that in a group offering 128 bits of security, 2^k group operations should yield a solution to an instance of the discrete logarithm problem with a probability of only 2^2k-256. For example, 2⁷⁰ group operations would find a discrete logarithm with only an astronomically small probability of 2^-116. Moreover, each group operation is slower than a CRHF evaluation.

How many hash evaluations are feasible today?

In January 2020, the cost of computing just shy of 2⁶⁴ SHA-1 evaluations using GPUs was $45,000. This puts the cost of 2⁷⁰ SHA-1 evaluations on GPUs at a few million dollars (perhaps lower after the Ethereum merge, as the transition away from GPU-dominated proof of work mining will likely decrease demand for GPU computing, lowering its cost).

With validity rollups already storing hundreds of millions of dollars in user funds, finding a convincing proof of a false statement can yield many millions of dollars in profit. Configuring a PQ-SNARK at 80 bits of security under aggressive conjectures leaves under 10 bits of wiggle room between profitable attacks and the conjectured security of the PQ-SNARK.

As another data point, Bitcoin’s network hash rate is currently about 2⁶⁷ hash evaluations per second, meaning bitcoin miners as a whole perform 2⁸⁰ SHA-256 evaluations every 2 hours. Of course, this eye-popping number is due to the vast investment in ASICs for bitcoin mining.

Assuming six bitcoin blocks created per hour, and given the current block reward of 6.25 Bitcoin per block, these 2⁸⁰ SHA-256 evaluations presumably cost less than $22,000 * 2 * 6 * 6.25 ≈ $1.6 million. Otherwise, bitcoin mining would not be profitable at current prices.

Proposed actions for a healthy ecosystem

The whole point of using SNARKs in rollups is to achieve blockchain scalability without users needing to trust the rollup service blindly. Since the rollup service wrote the smart contract that functions as the SNARK verifier, the public must be able to inspect the contract and confirm that it is indeed verifying SNARK proofs of the appropriate statements. Public scrutiny of the smart contract may also be necessary to ensure that the rollup service is not able to censor its own users. Proposed methods for censorship-resistance require the rollup’s smart contract to allow users to force the withdrawal of their funds if the rollup service fails to process their transactions.

Given the sophisticated nature of these protocols, this paradigm of public scrutiny places some burden on experts to openly and candidly discuss the security of the deployed contracts.

But with PQ-SNARKs, it can be difficult even for experts to ascertain the deployed protocol’s security level for the same reason that security and verifier performance are in tension for these SNARKs: the security level (and verifier costs) depend on internal parameters of the SNARK. And at least for StarkWare’s smart contracts, these parameters, called logBlowupFactor, proofOfWorkBits, and n_Queries, are not directly specified in the smart contract’s code but rather passed to the smart contract as public data. As far as I know, the easiest way to identify these parameters from on-chain information is via a four-step process:

view the appropriate smart contract on a block explorer such as Etherscan,
click on an appropriate “verify proof” transaction,
decode the transaction’s input data, and
figure out how to interpret the decoded input data to learn the key SNARK parameters that together determine the security level.

This final step requires finding the appropriate Solidity code implementing the transaction, which itself may be a confusing task. (When I was preparing survey talks on rollups this summer, Etherscan was able to decode the relevant input data to the SHARP verifier transactions as per Step 2 above. But that no longer appears to be working.)

Proposal 1: Scrutiny

With this in mind, my first suggestion to the web3 community is to make it far easier to scrutinize the security level of deployed post-quantum SNARKs. This likely involves better tooling for understanding on-chain data and increased transparency by the projects themselves in making these parameters known.

Proposal 2: Stronger norms

80 bits of security is too low, especially the weak version in which 2⁷⁰ hash evaluations are enough to successfully attack with a probability of about 1/1000 — even more so given the long history of surprising attacks on cryptographic primitives. One, mentioned above, is better attacks on pairing-friendly elliptic curves such as altbn128. A more famous example is SHA-1, which was standardized at 80 bits of security but was eventually shown to achieve only about 60 bits. With this history in mind, PQ-SNARK designers should leave themselves more than 10 bits of wiggle room when configuring the security level, especially if the security analysis involves strong conjectures about the statistical security of relatively new protocols such as FRI.

Even for PQ-SNARKs, appropriate security levels can always be achieved, simply by increasing verification costs. For example, increasing the security of SHARP’s verifier from 80 to 128 bits (under conjectures about FRI’s statistical soundness) would increase gas costs by roughly a factor of two, from a little over 5 million to a little over 10 million. Without conjectures about FRI, gas costs would increase by another factor of two, to over 20 million.

My next suggestion, then, is that the web3 community should develop stronger norms around appropriate security levels for deployed SNARKs. My personal recommendation would be at least 100 bits, and at least 128 if the SNARK’s security is based on non-standard assumptions. I am not the first to make such a proposal.

Here, I am willing to categorize as a “standard assumption” unconditional security in the random oracle model, if the random oracle in the deployed SNARK is instantiated with a standard hash function such as KECCAK-256. The random oracle model is an idealized setting that is meant to capture the behavior of well-designed cryptographic hash functions. It is often used to analyze the security of PQ-SNARKs.

An example of non-standard assumptions is conjectures (described below) regarding the quantitative soundness of a protocol such as FRI, either in an interactive setting or as a non-interactive protocol in the random oracle model.

SNARK designers are innovating in many exciting ways, some of which may push the envelope in terms of security – for example, by using SNARK-friendly hash functions that have not been subjected to as much cryptanalysis as more standard hash functions. I am not calling for an end to such efforts – far from it. But these primitives should be instantiated at security levels that exceed known, feasible attacks by well over 10 bits.

Rollups using SNARKs are commonly described as inheriting the security of their L1. But this is a difficult case to make if they are configured at much lower security levels than any cryptographic primitives used by the L1. As SNARKs see increasing adoption, we should make sure we deploy them in ways that are consistent with how we talk about them. So long as SNARKs are analyzed carefully, configured appropriately, and implemented correctly, they are as secure as any cryptographic primitive in use today.

An aside: Diving even deeper into PQ-SNARK security

The 80 bits of security in StarkWare’s PQ-SNARKs are accounted for as follows. These PQ-SNARKs make use of a polynomial commitment scheme called FRI, and StarkWare’s SHARP verifier runs FRI at 48 bits of security under a conjecture. Roughly speaking, the conjecture is that a simple attack on the soundness of FRI is optimal. In this attack, a cheating prover, with a small amount of effort, generates a “FRI proof” that passes the verifier’s randomly chosen checks with probability 2^-48.

StarkWare combines FRI with a technique called “grinding”. This requires the honest prover to produce a 32-bit proof of work before producing a FRI proof. The benefit of grinding is that it drastically increases the work required for a cheating prover to carry out the simple attack on FRI mentioned above, to about 2³² hash evaluations.

Since (in expectation) 2⁴⁸ attempts of the simple attack are necessary before one of them is successful, the total amount of work the cheating prover has to do to forge a FRI proof is roughly 2⁴⁸ * 2³² = 2⁸⁰ hash evaluations.

Finally, let us unpack how the 48 bits of security for FRI arise. The FRI verifier makes “queries” to the committed polynomial. FRI verification costs grow linearly with the number of queries. For example, 36 FRI verifier queries are roughly 4 times as expensive as 9 queries. But more queries mean more security. The number of “security bits per query” depends on another parameter of FRI, called the code rate.

Specifically, FRI uses something called the Reed-Solomon code of rate r, where r is always a number strictly between 0 and 1. The FRI prover’s costs are inversely proportional to r, so a rate of 1/16 leads to a prover that is about four times slower and more space-intensive than a rate of ¼.

The SHARP verifier has been running FRI with a code rate of 1/16 and with 12 verifier queries.StarkWare conjectures that each FRI verifier query adds log₂(1/r)bits of security. Under this conjecture, 12 verifier queries yields 12 * log₂(16) = 48 bits of security.However, current analyses only establish that each verifier query adds slightly less than log₂(1/r¹^/2) bits of security. So 12 FRI verifier queries only yield less than 12 * log₂(4) = 24 bits of “provable” security.

A proposal for mitigating the tension between security and performance

Practical PQ-SNARKs have verifier costs that grow linearly with the desired number of bits of security. One promising technique for mitigating this tension is SNARK composition — which I described in my previous post as a means to resolve tension between prover and verifier costs, but it can also address security.

Two examples

Polygon Hermez is composing PQ-SNARKs with PlonK. The idea is that the prover first generates a PQ-SNARK proof π. If the PQ-SNARK is configured to have a fast prover and an adequate security level, then π will be large. So the prover does not send π to the verifier. Instead, it uses PlonK to prove that it knows π.

This means applying PlonK to a circuit that takes π as input and checks that the PQ-SNARK verifier would accept π. Since the PQ-SNARK has polylogarithmic verification cost, PlonK is applied to a small circuit, and hence the PlonK prover is fast. Since PlonK proofs are small and cheap to verify, verification costs are low.

Unfortunately, the use of PlonK destroys transparency and post-quantum security. One can instead consider using the PQ-SNARK itself in place of PlonK to prove knowledge of π (in fact the PQ-SNARK used by Polygon is self-composed in this manner).

In this second application of the PQ-SNARK, to prove knowledge of π, the system can be configured to achieve adequate security with reasonably-sized proofs, for example, by selecting a very small code rate for use in FRI. The key point is that, while this small code rate is bad for prover time, the second application of the PQ-SNARK is applied only to a small circuit, so the total prover time should still be small.

Our theoretical understanding of the security of composed SNARKs leaves much to be desired. However, there aren’t known attacks on them that are faster than attacking one of the constituent SNARKs individually. For example, if composing a PQ-SNARK with PlonK, we do not know a better attack than to either attack the PQ-SNARK (i.e., find a PQ-SNARK proof π of a false statement), or to attack PlonK (i.e., find a PlonK proof of the false statement “I know a PQ-SNARK proof π that the verifier would have accepted.”)

Composing SNARKs in this manner is an increasingly popular way to improve performance. I hope that protocol designers also use it to improve security.

***

Justin Thaler is an Associate Professor at Georgetown University. Before joining Georgetown, he spent two years as a Research Scientist at Yahoo Labs in New York, before which he was a Research Fellow at the Simons Institute for the Theory of Computing at UC Berkeley.

Editor: Tim Sullivan @tim_org

***
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the current or enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.

This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.

Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information