a16z: Why are encrypted mempools not a universal remedy for MEV?

Authors: Pranav Garimidi, Joseph Bonneau, Lioba Heimbach, a16z

Translation: Saoirse, Foresight News

In blockchains, maximal extractable value (MEV) is the most value that can be earned by deciding which transactions to include in a block, which to exclude, and in what order they appear. MEV is prevalent on most blockchains and has been widely discussed in the industry.

Note: This article assumes that readers have a basic understanding of MEV. Readers new to the topic may want to read our MEV Explainer first.

Many researchers studying MEV have asked an obvious question: can cryptography solve this problem? One proposed solution is the encrypted mempool: users broadcast encrypted transactions, which are decrypted and revealed only after their order has been fixed. The consensus protocol must therefore choose the transaction order "blindly," which seems to prevent the exploitation of MEV opportunities during the ordering phase.

Unfortunately, both in practical application and theoretical terms, encrypted mempools cannot provide a universal solution to the MEV problem. This article will outline the challenges involved and explore feasible design directions for encrypted mempools.

How Encrypted Mempools Work

There have been many proposals regarding encrypted mempools, but their general framework is as follows:

  1. Users broadcast encrypted transactions.
  2. Encrypted transactions are submitted to the blockchain (in some proposals, transactions must first undergo verifiable random shuffling).
  3. Once the block containing these transactions is finalized, the transactions are decrypted.
  4. Finally, these transactions are executed.

It is important to note that step 3 (transaction decryption) raises a critical question: who is responsible for decryption, and what happens if decryption fails? One simple idea is to let users decrypt their own transactions (in which case encryption is not even necessary; a hiding commitment suffices). However, this approach is vulnerable: attackers can carry out speculative MEV.

In speculative MEV, attackers guess that a certain encrypted transaction contains MEV opportunities, then encrypt their own transactions and attempt to insert them into advantageous positions (e.g., before or after the target transaction). If the transactions are arranged in the expected order, the attacker will decrypt and extract MEV through their own transaction; if not, they will refuse to decrypt, and their transaction will not be included in the final blockchain.
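The self-decryption idea and the speculative MEV attack against it can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any proposal's actual protocol: the hash-based commitment, the helper names (`commit`, `verify_opening`, `execute_block`), and the example payloads are all hypothetical.

```python
import hashlib
import secrets


def commit(tx: bytes) -> tuple[bytes, bytes]:
    """Return (commitment, nonce) for a transaction payload (toy hash commitment)."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + tx).digest(), nonce


def verify_opening(commitment: bytes, nonce: bytes, tx: bytes) -> bool:
    """Check that (nonce, tx) opens the given commitment."""
    return hashlib.sha256(nonce + tx).digest() == commitment


def execute_block(ordered_commitments, openings):
    """Execute only the transactions whose owners revealed a valid opening.

    ordered_commitments: commitments in the finalized order.
    openings: dict mapping commitment -> (nonce, tx) for owners who revealed.
    """
    executed = []
    for c in ordered_commitments:
        opening = openings.get(c)
        if opening is not None and verify_opening(c, *opening):
            executed.append(opening[1])
    return executed


victim_tx = b"swap 100 ETH -> TOKEN"   # hypothetical payloads
attack_tx = b"buy TOKEN first"
victim_c, victim_n = commit(victim_tx)
attack_c, attack_n = commit(attack_tx)

# Case 1: the attacker landed ahead of the victim, so it reveals and profits.
good_order = execute_block(
    [attack_c, victim_c],
    {victim_c: (victim_n, victim_tx), attack_c: (attack_n, attack_tx)},
)

# Case 2: the attacker landed behind the victim, so it simply withholds its opening.
bad_order = execute_block(
    [victim_c, attack_c],
    {victim_c: (victim_n, victim_tx)},
)
```

In the second case the attacker's transaction never executes, so a failed guess costs it nothing unless the protocol adds a penalty for unrevealed transactions.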

Perhaps penalties could be imposed on users who fail to decrypt, but implementing this mechanism is extremely challenging. The reason is that the penalty for all encrypted transactions must be uniform (after all, transactions cannot be distinguished once encrypted), and the penalty must be severe enough to deter speculative MEV even in the face of high-value targets. This would lead to a large amount of funds being locked up, and these funds must remain anonymous (to avoid revealing the association between transactions and users). More troubling is that if a genuine user is unable to decrypt due to a software bug or network failure, they would also suffer losses.
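To see why the required penalty grows so quickly, consider a deliberately simplified model that is an assumption of this sketch and not part of any proposal: the attacker's transaction lands in the favorable position with probability `success_prob`, earns `mev_value` when it does, and forfeits a bond otherwise. Deterrence then requires the expected loss to be at least the expected gain.

```python
def min_bond(success_prob: float, mev_value: float) -> float:
    """Smallest bond B with success_prob * mev_value <= (1 - success_prob) * B.

    Toy model: the attacker wins `mev_value` with probability `success_prob`
    and otherwise refuses to decrypt, forfeiting the bond.
    """
    if not 0 <= success_prob < 1:
        raise ValueError("success_prob must be in [0, 1)")
    return success_prob * mev_value / (1 - success_prob)


# With a 50% chance of landing in the right slot, the bond must match the prize.
even_odds_bond = min_bond(0.5, 100.0)
# With a 75% chance, the bond must be three times the prize.
likely_bond = min_bond(0.75, 100.0)
```

Because encrypted transactions are indistinguishable, every user would have to post a bond sized for the most valuable plausible target, which is where the large locked-up balances come from.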

Therefore, most proposals suggest that when encrypting transactions, it must be ensured that they can be decrypted at some point in the future, even if the transaction initiator is offline or refuses to cooperate. This goal can be achieved through several methods:

Trusted Execution Environments (TEEs): Users can encrypt transactions to keys held securely by a Trusted Execution Environment (TEE). In some basic versions, the TEE is only used to decrypt transactions after a specific point in time (which requires the TEE to have a trustworthy notion of time). More complex solutions allow the TEE to decrypt transactions and build blocks, ordering transactions based on criteria such as arrival time and fees. Compared to other encrypted mempool solutions, the advantage of TEEs is that they can operate directly on plaintext transactions, reducing redundant on-chain data by filtering out transactions that would revert. However, the downside of this method is its reliance on the trustworthiness of the hardware.

Secret Sharing and Threshold Encryption: In this scheme, users encrypt transactions to a key that is jointly held by a specific committee (usually a subset of validators). Decryption requires meeting certain threshold conditions (e.g., two-thirds of the committee members must agree).
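The threshold idea can be illustrated with Shamir secret sharing, where any `threshold` shares recover a key and fewer reveal nothing. This is a toy sketch under assumptions: the field prime, function names, and parameters are illustrative, and real threshold encryption schemes let the committee decrypt without ever reconstructing the key in one place.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime, used as the field modulus for this toy example


def split_secret(secret: int, threshold: int, num_shares: int):
    """Split `secret` into points on a random polynomial of degree threshold - 1."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# A 6-member committee where two-thirds (4 members) must cooperate.
decryption_key = 123456789
shares = split_secret(decryption_key, threshold=4, num_shares=6)
recovered = recover_secret(shares[:4])
```

Any four of the six shares work; the choice of which four does not matter.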

When using threshold decryption, the trusted entity shifts from hardware to the committee. Proponents argue that since most protocols already assume that validators possess the "honest majority" characteristic in consensus mechanisms, we can make a similar assumption that the majority of validators will remain honest and will not decrypt transactions prematurely.

However, there is a key distinction: these two trust assumptions are not the same. Consensus failures, such as blockchain forks, are publicly visible (a "weak trust assumption"), while a malicious committee that privately decrypts transactions early leaves no public evidence, so such attacks can be neither detected nor punished (a "strong trust assumption"). Therefore, although the security assumptions of the consensus mechanism and of the decryption committee look the same at first glance, in practice the assumption that "the committee will not collude" is much less credible.

Time-lock and Delay Encryption: As an alternative to threshold encryption, the principle of delay encryption is that users encrypt transactions to a public key, while the private key corresponding to that public key is hidden within a time-lock puzzle. A time-lock puzzle is a cryptographic puzzle that encapsulates a secret, which can only be revealed after a predetermined time, specifically requiring a series of sequential computations that cannot be parallelized. Under this mechanism, anyone can solve the puzzle to obtain the key and decrypt the transaction, but only after completing a lengthy (essentially serial) computation that ensures the transaction cannot be decrypted before final confirmation. The strongest form of this encryption primitive is to publicly generate such puzzles through delay encryption techniques; it can also be approximated through trusted committees using time-lock encryption, although its relative advantages over threshold encryption are debatable.
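The sequential-computation idea can be sketched with the classic repeated-squaring time-lock puzzle of Rivest, Shamir, and Wagner, which is swapped in here as one concrete example and is not necessarily what any given mempool proposal uses. The tiny modulus, the base `2`, and the function names are assumptions for illustration; a real deployment would use a large RSA modulus with secret factors.

```python
def create_puzzle(p: int, q: int, t: int, key: int):
    """Hide `key` behind t sequential squarings modulo n = p * q.

    The creator knows the factors, so it can reduce the exponent 2**t
    modulo phi(n) and compute the mask quickly.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    base = 2
    mask = pow(base, pow(2, t, phi), n)  # equals base**(2**t) mod n since gcd(base, n) = 1
    return n, base, t, (key + mask) % n


def solve_puzzle(n: int, base: int, t: int, masked_key: int) -> int:
    """Recover the key without the factors: t squarings, one after another."""
    value = base % n
    for _ in range(t):
        value = (value * value) % n  # each step needs the previous result
    return (masked_key - value) % n


# Toy parameters: two Mersenne primes stand in for secret RSA primes.
P, Q = 2**31 - 1, 2**61 - 1
secret_key = 424242
puzzle = create_puzzle(P, Q, t=10_000, key=secret_key)
solved_key = solve_puzzle(*puzzle)
```

The solver's loop cannot be split across machines because every squaring depends on the one before it, which is what ties the delay to wall-clock time and also why the exact decryption moment is hard to pin down.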

Whether using delay encryption or having a trusted committee perform the computation, these schemes face many practical challenges. First, since the delay fundamentally relies on a computational process, it is difficult to control the decryption time precisely. Second, these schemes require some entity to run high-performance hardware to solve the puzzles efficiently; although anyone can take on this role, it remains unclear how to incentivize that entity to participate. Finally, in such designs all broadcast transactions will be decrypted, including those that were never written into a block. In contrast, threshold (or witness encryption) schemes can decrypt only those transactions that are successfully included.

Witness Encryption: The final and most advanced cryptographic scheme is "witness encryption." In theory, witness encryption encrypts information so that only someone who knows a "witness" for a specific NP relation can decrypt it. For example, information can be encrypted such that only someone who can solve a particular Sudoku puzzle or provide a specific hash preimage can decrypt it.

(Note: An NP relation is the correspondence between a "problem" and an "answer that can be quickly verified.")
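The hash preimage example makes the "quickly verified" part concrete. The sketch below only checks a witness against a statement; it does not implement witness encryption itself, for which no practical scheme is currently known. The function name and the example preimage are hypothetical.

```python
import hashlib


def relation_holds(statement: str, witness: bytes) -> bool:
    """NP relation R(statement, witness): witness is a SHA-256 preimage of statement."""
    return hashlib.sha256(witness).hexdigest() == statement


# The statement is public; finding a witness is believed hard, checking one is fast.
statement = hashlib.sha256(b"example preimage").hexdigest()
accepted = relation_holds(statement, b"example preimage")
rejected = relation_holds(statement, b"some other guess")
```

A witness encryption scheme would tie decryption to possession of a `witness` for which `relation_holds` returns `True`.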

For any NP relation, similar logic can be implemented through SNARKs. Witness encryption essentially encrypts data in a form that can only be decrypted by an entity that can prove, via a SNARK, that it meets specific conditions. In the context of encrypted mempools, a typical example of such a condition is that transactions can only be decrypted after the block is finalized.

This is a highly promising theoretical primitive. In fact, it is a generalization, with both committee-based and delay-based methods being special cases of it. Unfortunately, we currently do not have any practical witness encryption scheme. Moreover, even if such schemes existed, it is difficult to say that they would be more advantageous than committee-based methods in proof-of-stake chains. Even if witness encryption is set to "only decrypt once the transaction has been ordered in a finalized block," a malicious committee could still privately simulate the consensus protocol to fabricate finality for the transaction, then use this private chain as the "witness" to decrypt it. In that case, threshold decryption by the same committee would achieve equivalent security while being much simpler to operate.

However, in proof-of-work consensus protocols, the advantages of witness encryption are more pronounced, because even an entirely malicious committee cannot privately mine multiple new blocks on top of the current chain head to fabricate finality.

Technical Challenges Facing Encrypted Mempools

Several practical challenges limit the ability of encrypted mempools to prevent MEV. In general, keeping information confidential is itself difficult. Encryption is not yet widely used in the Web3 space, but decades of practice in deploying it on the web (such as TLS/HTTPS) and in private communications (from PGP to modern encrypted messaging platforms like Signal and WhatsApp) have fully exposed the difficulties involved: while encryption is a tool for protecting confidentiality, it cannot guarantee absolute security.

First, certain entities may directly access the plaintext information of user transactions. In typical scenarios, users usually do not encrypt transactions themselves but delegate this task to wallet service providers. As a result, wallet service providers can access the plaintext of transactions and may even exploit or sell this information to extract MEV. The security of encryption always depends on all entities that can access the keys. The scope of key control defines the boundary of security.

In addition, the biggest problem lies in the metadata, which is the unencrypted data surrounding the encrypted payload (transaction). Searchers can use this metadata to infer transaction intentions and implement speculative MEV. It is important to note that searchers do not need to fully understand the transaction content or guess correctly every time. For example, as long as they can reasonably determine that a transaction is a buy order from a specific decentralized exchange (DEX), it is sufficient to initiate an attack.

We can categorize metadata into several types: some are classic problems inherent to encryption, while others are unique to encrypted mempools.

  • Transaction Size: Encryption itself cannot hide the size of the plaintext (notably, the formal definition of semantic security explicitly excludes hiding plaintext size). This is a common attack vector in encrypted communications; a typical case is that even after encryption, an eavesdropper can determine in real-time what content is being played on Netflix by analyzing the size of each packet in the video stream. In an encrypted mempool, specific types of transactions may have unique sizes, thereby leaking information.
  • Broadcast Time: Encryption also cannot hide timing information (this is another classic attack vector). In Web3 scenarios, certain senders (such as those running a structured sell-off) may initiate transactions at fixed intervals. The timing of transactions may also correlate with other information, such as activity on external exchanges or news events. A subtler way to exploit timing information is arbitrage between centralized exchanges (CEX) and decentralized exchanges (DEX): a sequencer can exploit the latest CEX price information by inserting a transaction created as late as possible, while excluding all other transactions broadcast after a certain point in time (even if encrypted), ensuring that its transaction alone benefits from the latest price.
  • Source IP Address: Searchers can infer the identity of transaction senders by monitoring peer-to-peer networks and tracking source IP addresses. This issue was identified in the early days of Bitcoin (over a decade ago). If a specific sender has a fixed behavioral pattern, this is highly valuable to searchers. For example, knowing the sender's identity allows them to associate encrypted transactions with previously decrypted historical transactions.
  • Transaction Sender and Fee/Gas Information: Transaction fees are a type of metadata unique to encrypted mempools. In Ethereum, traditional transactions include the on-chain sender address (used to pay fees), maximum gas budget, and the unit gas fee the sender is willing to pay. Similar to the source network address, the sender address can be used to associate multiple transactions with real entities; the gas budget can imply transaction intent. For instance, interacting with a specific DEX may require a recognizable fixed amount of gas.

Sophisticated searchers may combine the above types of metadata to predict transaction content.
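As a rough illustration of how such a combination might work, the sketch below matches an encrypted transaction's visible gas limit and payload size against a table of fingerprints. Every number and label in the table is invented for this example and is not measured on any chain; the function name and tolerance are likewise assumptions.

```python
# Hypothetical fingerprints: (label, typical gas limit, typical payload size in bytes).
# These values are made up for illustration only.
FINGERPRINTS = [
    ("dex_swap", 180_000, 356),
    ("token_transfer", 65_000, 68),
    ("nft_mint", 250_000, 132),
]


def guess_tx_type(gas_limit: int, payload_size: int, tolerance: float = 0.10):
    """Return the first label whose gas limit and size both match within `tolerance`."""
    for label, gas, size in FINGERPRINTS:
        gas_ok = abs(gas_limit - gas) <= tolerance * gas
        size_ok = abs(payload_size - size) <= tolerance * size
        if gas_ok and size_ok:
            return label
    return None


likely_swap = guess_tx_type(gas_limit=185_000, payload_size=360)
unknown = guess_tx_type(gas_limit=21_000, payload_size=0)
```

The guess does not need to be right every time; a searcher only needs it to be right often enough for speculative MEV to be profitable on average.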

In theory, this information can all be hidden, but at the cost of performance and complexity. For example, padding transactions to a standard length can hide size but wastes bandwidth and on-chain space; adding delays before sending can hide time but increases latency; submitting transactions through anonymous networks like Tor can hide IP addresses, but this introduces new challenges.
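Padding, for instance, can be sketched as rounding every payload up to a fixed bucket size before encryption. The 256-byte bucket, the 4-byte length prefix, and the function names are assumptions chosen for illustration.

```python
def pad_to_bucket(payload: bytes, bucket: int = 256) -> bytes:
    """Length-prefix the payload, then zero-pad to the next multiple of `bucket`."""
    body = len(payload).to_bytes(4, "big") + payload
    padded_len = -(-len(body) // bucket) * bucket  # ceiling division
    return body.ljust(padded_len, b"\x00")


def unpad(padded: bytes) -> bytes:
    """Strip the length prefix and padding added by pad_to_bucket."""
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


small = pad_to_bucket(b"a" * 10)    # 14 bytes of content in a 256-byte bucket
medium = pad_to_bucket(b"b" * 200)  # 204 bytes of content in the same bucket
large = pad_to_bucket(b"c" * 300)   # 304 bytes of content spills into a second bucket
```

The small and medium payloads become indistinguishable by size, but the small one now occupies many times its original length, which is the bandwidth and block-space cost described above.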

The hardest metadata to hide is transaction fee information. Encrypting fee data creates a series of problems for block builders. The first is spam: if fee data is encrypted, anyone can broadcast malformed encrypted transactions, which will be ordered but cannot pay fees and, once decrypted, cannot be executed, with no one held accountable. This could potentially be solved with SNARKs that prove the transaction is well formed and sufficiently funded, but this would significantly increase overhead.

The second is the efficiency of block building and fee auctions. Builders rely on fee information to create profit-maximizing blocks and to determine the current market price of on-chain resources. Encrypted fee data would disrupt this process. One solution is to set a fixed fee for each block, but this is economically inefficient and could lead to a secondary market for transaction inclusion, contradicting the design intent of encrypted mempools. Another solution is to conduct fee auctions through secure multi-party computation or trusted hardware, but both methods are extremely costly.

Finally, a secure encrypted mempool would increase system overhead in multiple ways: encryption would increase chain latency, computational load, and bandwidth consumption; how to integrate with important future goals like sharding or parallel execution remains unclear; it could also introduce new failure points for liveness (such as decryption committees in threshold schemes or delay function solvers); at the same time, design and implementation complexity would significantly rise.

Many of the issues facing encrypted mempools are similar to the challenges encountered by blockchains aimed at ensuring transaction privacy (such as Zcash and Monero). If there is any positive aspect, it is that solving all the challenges of cryptographic techniques in MEV mitigation will also clear obstacles for transaction privacy.

Economic Challenges Facing Encrypted Mempools

Finally, encrypted mempools also face economic challenges. Unlike technical challenges, which can be gradually alleviated through sufficient engineering investment, these economic challenges represent fundamental limitations that are extremely difficult to resolve.

The core issue of MEV arises from the information asymmetry between transaction creators (users) and MEV opportunity exploiters (searchers and block builders). Users are often unaware of how much extractable value is contained in their transactions, so even if a perfect encrypted mempool exists, they may still be induced to leak decryption keys in exchange for a reward that is below the actual MEV value; this phenomenon can be termed "incentivized decryption."

This scenario is not hard to imagine, as similar mechanisms like MEV Share already exist in reality. MEV Share is an order flow auction mechanism that allows users to selectively submit transaction information to a pool, where searchers compete for the right to exploit MEV opportunities from that transaction. The winning bidder, after extracting MEV, returns a portion of the profits (either the bid amount or a certain percentage) to the user.

This model can be directly adapted to encrypted mempools: users would need to disclose decryption keys (or partial information) to participate. However, most users are unaware of the opportunity cost of participating in such mechanisms; they only see the immediate rewards and are willing to leak information. There are similar cases in traditional finance: for example, the zero-commission trading platform Robinhood profits by selling user order flow to third parties through "payment-for-order-flow."

Another possible scenario is that large builders may force users to disclose transaction content (or related information) under the pretext of censorship. Censorship resistance is an important and controversial topic in the Web3 space, but if large validators or builders are legally bound (such as by the U.S. Office of Foreign Assets Control (OFAC) regulations) to enforce a sanctions list, they may refuse to process any encrypted transactions. Technically, users could potentially use zero-knowledge proofs to verify that their encrypted transactions comply with censorship requirements, but this would add extra costs and complexity. Even if the blockchain has strong censorship resistance (ensuring that encrypted transactions are always included), builders may still prioritize known plaintext transactions at the front of the block while placing encrypted transactions at the end. Therefore, transactions that need to ensure execution priority may ultimately be forced to disclose their content to builders.

Other Efficiency Challenges

Encrypted mempools will increase system overhead in various obvious ways. Users need to encrypt transactions, and the system must decrypt them in some manner, which increases computational costs and may also increase transaction size. As mentioned earlier, hiding metadata will further add to these overheads. However, there are also some efficiency costs that are less obvious. In finance, a market is considered efficient if prices reflect all available information; delays and information asymmetry make markets less efficient, and both are an inevitable result of encrypted mempools.

Such inefficiencies will lead to a direct consequence: increased price uncertainty, which is a direct product of the additional delays introduced by encrypted mempools. Consequently, the number of transactions failing due to exceeding price slippage tolerance may increase, wasting on-chain space.
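The slippage effect can be illustrated with a toy constant-product pool. The fee-free pricing formula, the pool reserves, the 0.5% tolerance, and the function names are all assumptions of this sketch.

```python
def swap_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Output of a constant-product (x * y = k) swap, ignoring fees."""
    return reserve_out * amount_in / (reserve_in + amount_in)


def execute_swap(reserve_in, reserve_out, amount_in, min_out):
    """Return the output amount, or None if the slippage limit would be violated."""
    out = swap_out(reserve_in, reserve_out, amount_in)
    return out if out >= min_out else None


# Hypothetical pool: 1,000 ETH against 2,000,000 USDC. The user sells 10 ETH
# and tolerates 0.5% slippage relative to the quote seen at signing time.
eth, usdc = 1_000.0, 2_000_000.0
quote = swap_out(eth, usdc, 10.0)
min_out = quote * 0.995

# Executed immediately, the swap clears.
immediate = execute_swap(eth, usdc, 10.0, min_out)

# During the decryption delay another trader sells 20 ETH into the pool.
other_out = swap_out(eth, usdc, 20.0)
eth_after, usdc_after = eth + 20.0, usdc - other_out
delayed = execute_swap(eth_after, usdc_after, 10.0, min_out)
```

Here `delayed` is `None`: the price moved by more than the tolerance while the transaction was waiting, so it fails and still occupies block space.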

Similarly, this price uncertainty may also give rise to speculative MEV transactions that attempt to profit from on-chain arbitrage. Notably, encrypted mempools may make such opportunities more prevalent: because of execution delays, the current state of decentralized exchanges (DEXs) becomes less certain, likely leading to decreased market efficiency and price discrepancies between trading venues. Such speculative MEV transactions also waste block space, as they typically abort when no arbitrage opportunity is found.

Conclusion

The purpose of this article is to outline the challenges facing encrypted mempools so that attention can be directed toward developing other solutions. Even so, encrypted mempools may still become part of the answer to MEV.

One feasible idea is a hybrid design: some transactions could achieve "blind ordering" through an encrypted mempool, while others use different ordering schemes. A hybrid design may be appropriate for specific types of transactions, such as buy and sell orders from large market participants who can carefully encrypt or pad their transactions and are willing to pay higher costs to avoid MEV. It also makes practical sense for highly sensitive transactions, such as fixes for vulnerable smart contracts.

However, due to technical limitations, high engineering complexity, and performance overhead, encrypted mempools are unlikely to become the "universal solution to MEV" that many hope for. The community needs to develop other solutions, including MEV auctions, application-layer defenses, and shorter finality times. MEV will remain a challenge for the foreseeable future, requiring in-depth research to find a balance among various solutions to mitigate its negative impacts.
