Table of contents
- How encrypted mempools work
- Technical challenges for encrypted mempools
- Economic challenges for encrypted mempools
- Other efficiency challenges
The value that can be extracted by including, excluding, or reordering transactions in a block is known as “maximal extractable value, or MEV. MEV is common on most blockchains, and has been a topic of widespread interest and discussion in the community.
Note: This blog post assumes basic familiarity with MEV. Some readers might wish to start by reading our MEV explainer.
Many researchers, observing the MEV situation, have asked an obvious question: Can cryptography solve the problem? One approach is an encrypted mempool, where users broadcast encrypted transactions that are only revealed after they are sequenced. Hence, a consensus protocol would have to commit to a transaction ordering blindly, seemingly preventing the ability to take advantage of MEV opportunities during the ordering process.
Unfortunately, for both practical and theoretical reasons, we don’t see encrypted mempools providing a universal solution to MEV. We outline the difficulties and explore how encrypted mempools might be designed.
There have been many proposals for encrypted mempools, but the general framework for encrypted mempools is as follows:
Note that step 3 (transaction decryption) poses an important challenge: Who decrypts, and what if the decryption doesn’t happen? A naive solution would be to say that users themselves decrypt their transactions (in this case one wouldn’t even need encryption, but could simply hide commitments). However, this approach is vulnerable: An attacker can perform speculative MEV.
With speculative MEV, an attacker tries to guess that a certain encrypted transaction yields some MEV. They encrypt their own transactions which hopefully will appear in an opportune place (e.g., right before or after a target transaction). If the transaction is sequenced in the hoped-for order, the attacker decrypts and their transaction extracts the MEV. If not, they decline to decrypt and their transaction is not included in the final chain.
Perhaps users could face some penalty for failing to decrypt, but this is tricky to implement. The penalty would need to be the same for all encrypted transactions (since they are encrypted and therefore indistinguishable), but also large enough to discourage speculative MEV even against high-value targets. This would require tying up lots of capital, which should be anonymous (to avoid leaking information about which transactions are posted by which users). And it would end up costing honest users in the case of a bug or network failure which prevents their legitimate attempt to decrypt.
Hence, most proposals suggest that transactions be encrypted in such a way that decryption is guaranteed to be possible at some point in the future — even if the posting user goes offline or doesn’t cooperate. This can be achieved in several ways:
TEEs: Users can encrypt their transactions to a key held by a Trusted Execution Environment (TEE) enclave. In some simple versions, the TEE is only used to decrypt transactions after a certain time deadline (which requires some notion of time within the TEE). More advanced approaches use the TEE to decrypt transactions and build the block, ordering them based on some criteria like arrival times or fees. An advantage of TEEs compared to other encrypted mempool approaches is their ability to operate on plaintext transactions, thereby reducing onchain spam by filtering out transactions that would revert. However, this method requires trust in the hardware.
Secret-sharing and threshold encryption: With this approach, users encrypt transactions to a key that is shared by some committee, typically a subset of validators. Some threshold (e.g., two-thirds of the committee) is required for decryption.
The simplest approach uses a new shared decryption key in each round (e.g., each block or epoch on Ethereum), which the committee re-constructs and publishes after transactions are ordered in a finalized block. This approach can use simple secret sharing. The downside is that this reveals all transactions from the mempool, even those that were not included in a block. This approach also requires a new distributed key generation (DKG) in each round.
A better approach for privacy is to decrypt only the transactions that were actually included. This can be realized using threshold decryption. This approach also enables amortizing the cost of the DKG protocols but using the same key for multiple blocks with the committee threshold-decrypting transactions after they are ordered in a finalized block. A challenge is that the committee has to do a lot more work. Naively, the committee’s work is linear in the number of transactions to decrypt, although recent work suggests batch threshold decryption which can improve on this significantly.
With threshold decryption, trust shifts from a piece of hardware to a committee. The claim is that, since most protocols already make an honest majority assumption about validators for the consensus protocol, we can make a similar assumption that a majority of validators are honest and won’t decrypt transactions early. A note of caution is in order, however: These are not the same trust assumption. Consensus failures like forking the chain are publicly visible (called a weak trust assumption), whereas a malicious committee privately decrypting transactions early will generate no public evidence and hence such an attack cannot be detected or slashed (a strong trust assumption). Thus while, from the outside, the security assumptions for consensus and an encryption committee might look the same, practical considerations make the assumption that a committee won’t collude more tenuous.
Time-lock and delay encryption: An alternative to threshold encryption comes in the form of delay encryption. Users encrypt their transactions to a public key whose secret key is hidden within a time-locked puzzle. A time-locked puzzle is a cryptographic puzzle that encapsulates a secret, which can only be revealed after some predetermined amount of time has elapsed – more specifically, the puzzle can be decrypted by repeatedly performing some non-parallelizable computation. In this case, this puzzle can be opened by anybody to recover the key and decrypt transactions, but only after a slow (inherently sequential) computation which is designed to take long enough that transactions cannot be decrypted before they have been finalized. The strongest version of this primitive uses delay encryption to publicly generate such a puzzle. It can also be approximated by using a trusted committee to compute the puzzle via time-lock encryption, although at that point the advantages over threshold encryption are questionable.
Whether by delay encryption or computation by a trusted committee, there are a number of practical challenges. First, it is more difficult to ensure precise timing of decryption, since the delay is computational in nature. Second, these schemes rely on some entity running sophisticated hardware to solve the puzzles efficiently. While anybody can fulfill this role, it is unclear how to incentivize this party. Finally, in these designs, all broadcast transactions will be decrypted, including those which were never finalized in a block. Threshold-based (or witness encryption) solutions could potentially only decrypt transactions that are successfully included.
Witness encryption. Finally, the most advanced cryptographic approach uses a tool called witness encryption. Theoretically, witness encryption allows encrypting information to anybody who knows the witness to a specific NP relation. For example, one could encrypt such that anybody with the solution to a certain Sudoku puzzle, or anybody with a hash preimage of a certain value, can decrypt.
SNARKs are also possible for any NP relation. One can think of witness encryption as encrypting data to anybody who can compute a SNARK proving a desired condition. For encrypted mempools, one example for such a condition would be that transactions can only be decrypted when a block has been finalized.
This is a very powerful theoretical primitive. In fact, it is a generalization for which committee-based approaches and delay-based approaches are specific cases. Unfortunately, we don’t have any practical constructions of witness-based encryption. Furthermore, even if we did, it is not clear how it is an improvement over a committee-based approach for proof-of-stake chains. Even if witness encryption is set up so that transactions can only be decrypted once they are ordered in a finalized block, a malicious committee can still privately simulate the consensus protocol such that the transaction is finalized, and then use this private chain as a witness to decrypt transactions. At that point, using threshold decryption by the same committee provides equivalent security and is much simpler.
Witness encryption offers a more decisive edge in proof-of-work consensus protocols, since even a fully malicious committee can’t privately mine several new blocks on top of the current head to simulate finality.
Several important practical challenges limit encrypted mempools’ ability to prevent MEV. In general, keeping information confidential is hard. Interestingly, encryption is a tool rarely used in the web3 space. But we have decades of practical experience that showcase the challenges to deploying encryption on the web (TLS/HTTPS) and for private communication (from PGP to modern encrypted messaging platforms like Signal or Whatsapp). While encryption is a tool to preserve confidentiality, it does not guarantee it.
First, there might be parties with direct access to the plaintext of a user’s transaction. In typical cases, users may not encrypt their own transaction but outsource this to their wallet provider. Consequently, the wallet provider has access to the plaintext transaction and could use or sell this information to extract MEV. Encryption is never stronger than the set of parties with access to the key.
Beyond this, the biggest problem is metadata, that is, data surrounding the encrypted payload (transaction), which is not encrypted. Searchers can use this metadata to guess the intent of a transaction and then attempt speculative MEV. Remember, searchers don’t have to fully understand a transaction, or be right every time. It’s enough if they know, for example, that with some reasonable probability a transaction represents a buy order from a specific DEX.
We can consider several types of metadata, some of which are classic challenges with encryption and some of which are unique for encrypted mempools.
Sophisticated searchers might use any combination of the above metadata types to predict a transaction’s contents.
All of this information could potentially be hidden, but at a performance and complexity cost. For example, padding transactions up to a standard limit hides transaction size, but it wastes bandwidth and space onchain. Adding delays before sending messages hides transaction time, but harms latency. Submitting transactions over an anonymity network like Tor could hide the sender’s IP address, but this has its own challenges.
The most difficult metadata to hide is transaction fee data. Encrypting this data creates a number of challenges for the builder. The first problem is spam. If transaction fee payment data is encrypted, then anybody can broadcast malformed encrypted transactions that will be ordered but have no ability to pay fees. Hence, after decryption they will not be able to execute but no party can be penalized. This could possibly be addressed with SNARKs that prove that a transaction is well-formed and funded, but this would greatly increase overhead.
The second problem is efficient block building and fee auctions. Builders use fees to build the most profitable block and establish the current market price for onchain resources. Encrypting this data disrupts this process. This could be addressed by establishing a flat fee in each block, but this is economically inefficient and might encourage secondary markets for transaction inclusion that would undermine the point of having an encrypted mempool. Conducting a fee auction using secure multi-party computation or trusted hardware is possible, but these are both expensive steps.
Finally, secure, encrypted mempools impose overhead on the system in a number of ways. Encryption adds latency, computational and bandwidth overhead to the chain. How to combine it with support for sharding or parallel execution — which are important future goals — is far from obvious. It might add additional points of failure for liveness (e.g., the decryption committee for threshold solutions or delay function solver). And it certainly adds to design and implementation complexity.
Many of the problems of encrypted mempools are shared by blockchains that themselves aim to ensure transaction privacy (e.g., Zcash, Monero). If there is a silver lining, it is that solving all of the challenges of encryption for MEV abatement would as a side effect clear the way for transaction privacy.
Finally, encrypted mempools face economic challenges. Unlike the technical challenges, which could potentially be mitigated with enough engineering effort, these are fundamental limitations that appear difficult to solve.
The basic problem of MEV results from information asymmetry between users creating transactions, and the searchers and builders finding MEV opportunities. Users typically don’t know how much MEV can be extracted from their transactions. As a result, even with a perfect encrypted mempool, users might be induced to give up their decryption keys in exchange for a payment from builders that is less than the value extracted. We can call this incentivized decryption.
It’s not hard to imagine how this would look because a version of it, called MEV Share, exists today. MEV Share is an order flow auction that allows users to selectively submit information about their transactions to a pool where searchers compete for the chance to exploit the MEV opportunity presented by the transaction. The searcher with the winning bid extracts the MEV and refunds part of their profit (i.e., the bid or a fraction of the bid) to the user.
This model could be immediately adapted within an encrypted mempool space, requiring users to reveal their decryption keys (or possibly just some partial information) to participate. But most users won’t understand the opportunity cost of participating in such a scheme, only seeing the rewards coming back to them and being happy to give up their information. There are also examples from traditional finance (e.g., zero-fee trading services like Robinhood) that profit from selling their users’ order flow to third parties in a so-called “payment-for-order-flow” business model.
Other possible scenarios include large builders forcing users to reveal their transactions (or some information about them) for censorship reasons. Censorship resilience is an important and controversial topic within web3, but if large proposers and/or builders are legally obligated to enforce a censorship list (e.g., by OFAC), they may refuse to sequence any encrypted transactions. It might be possible to solve this problem technically, if users can produce a zero-knowledge proof that their encrypted transaction complies with the censorship list, but this will also add cost and complexity. Even if the chain has strong censorship resistance, where encrypted transactions are guaranteed to be included, block builders might still prioritize putting transactions they know in plaintext at the top of the block and demoting any encrypted transactions to the bottom of the block. Thus, transactions that desire certain execution guarantees might be forced to reveal their contents to builders anyways.
Encrypted mempools add overhead to the system in a few obvious ways. Users must encrypt transactions and the system must somehow decrypt them. This adds computational cost and possibly increases transaction size. As discussed above, dealing with metadata can make these overheads worse. However, some other efficiency costs are less obvious. In finance, a market is considered to be efficient if the price reflects all available information, and inefficiencies arise from delays and information asymmetries — natural consequences of encrypted mempools.
One of the consequences of these inefficiencies is increased price uncertainty, the result of the additional delay encrypted mempools introduce. Thus, the number of trade failures due to exceeding the price slippage tolerance are likely to increase and waste chain space.
Similarly, this price uncertainty might also lead to speculative MEV transactions that attempt to profit from onchain arbitrage. Importantly, these opportunities might be more widespread with encrypted mempools because the increased uncertainty around the current state of DEXs, in light of the delayed execution, is likely to produce less efficient markets with price discrepancies across venues. These types of speculative MEV transactions would also waste block space because they will often abort if no such opportunities are found.
***
While our goal here was to outline the challenges in encrypted mempools, so that people can focus on building and solving things in other ways, they may yet be a part of the solution to MEV.
One possible solution is hybrid designs, where some transactions are blind-ordered via an encrypted mempool and some are ordered via another solution. For certain types of transactions — for example, buy/sell orders from large market participants who can carefully encrypt/pad them and are willing to pay more to avoid MEV — hybrid designs may be the right solution. These designs might also make sense for certainly highly sensitive transactions, such as bug fixes to a security contract with a vulnerability.
However, due to their technological limitations, as well as significant engineering complexity and performance overheads, encrypted mempools are unlikely to be the silver bullet solution to MEV that they are sometimes made out to be. The community will need to develop other solutions, including MEV auctions, application-layer defenses and minimizing finality time. MEV will be a challenge for some time to come, and careful investigation is needed to find the right balance of solutions to manage its downsides.
***
Pranav Garimidi is a Research Associate at a16z Crypto. He does research on problems in mechanism design and algorithmic game theory as it relates to blockchain systems. He is especially focused on how incentives interact across the blockchain stack. Prior to a16z, Pranav was a student at Columbia University where he graduated with a degree in Computer Science.
Joseph Bonneau is an Associate Professor in the Computer Science Department at the Courant Institute, New York University, and a technical adviser to a16z crypto. His research focuses on applied cryptography and blockchain security. He has taught cryptocurrency courses at the University of Melbourne, NYU, Stanford, and Princeton, and received a PhD in computer science from the University of Cambridge and BS/MS degrees from Stanford.
Lioba Heimbach is a fourth-year Ph.D. student advised by Prof. Dr. Roger Wattenhofer in the Distributed Computing (Disco) group at ETH Zurich. Her research centers around blockchain protocols with a focus on decentralized finance (DeFi). In particular, it focuses on enabling an accessible, decentralized, fair, and efficient blockchain and DeFi ecosystem. She was a research intern at a16z crypto during the summer of 2024.
***
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the current or enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.
This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.
The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.