a16z: Why Encrypted Mempools Are Not a Universal Remedy for MEV

Original Title: On the limits of encrypted mempools

Original Authors: Pranav Garimidi, Joseph Bonneau, Lioba Heimbach, a16z

Original Compilation: Saoirse, Foresight News

In blockchain, the maximum extractable value (MEV) refers to the maximum value that can be earned by deciding which transactions to include in a block, which to exclude, or by adjusting the order of transactions. MEV is prevalent in most blockchains and has been a widely discussed topic in the industry.

Note: This article assumes a basic understanding of MEV. Readers new to the topic may want to read our MEV explainer article first.

Many researchers observing the MEV phenomenon have raised a clear question: Can cryptography solve this problem? One proposed solution is to use encrypted mempools: users broadcast encrypted transactions, which are decrypted and revealed only after their order has been fixed. The consensus protocol must then order transactions "blindly," which seems to prevent the exploitation of MEV opportunities during the ordering phase.

Unfortunately, both in practical application and theoretical terms, encrypted mempools cannot provide a universal solution to the MEV problem. This article will outline the difficulties involved and explore feasible design directions for encrypted mempools.

How Encrypted Mempools Work

There have been many proposals regarding encrypted mempools, but their general framework is as follows:

  1. Users broadcast encrypted transactions.

  2. Encrypted transactions are submitted to the blockchain (in some proposals, transactions must first undergo verifiable random shuffling).

  3. Once the block containing these transactions is finalized, the transactions are decrypted.

  4. Finally, these transactions are executed.
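
The four steps can be sketched in a few lines. This is a minimal illustration rather than any specific proposal: `toy_encrypt` is a hypothetical XOR-keystream stand-in for real encryption (it is not secure), a seeded shuffle stands in for the ordering step, and a single shared key sidesteps the question of who performs decryption.

```python
import hashlib
import random


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. Toy stand-in, NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


# XOR with the same keystream undoes itself, so decryption reuses the function.
toy_decrypt = toy_encrypt


def run_blind_ordering(plaintext_txs, key, seed=0):
    # Step 1: users broadcast only ciphertexts.
    mempool = [toy_encrypt(key, tx) for tx in plaintext_txs]
    # Step 2: the order is fixed without looking inside the ciphertexts
    # (a seeded shuffle stands in for consensus ordering).
    ordered = list(mempool)
    random.Random(seed).shuffle(ordered)
    # Step 3: decryption happens only after the order is final.
    decrypted = [toy_decrypt(key, ct) for ct in ordered]
    # Step 4: the decrypted transactions would now be executed in this order.
    return ordered, decrypted


txs = [b"swap 10 ETH for USDC", b"transfer 5 USDC to bob", b"mint NFT #7"]
ordered, decrypted = run_blind_ordering(txs, key=b"demo-key")
```

The ordering step only ever touches `mempool`, never `plaintext_txs`, which is the property the framework is after.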

Step 3 (transaction decryption) raises critical questions: who is responsible for decryption, and what happens if decryption fails? One simple idea is to let users decrypt their own transactions (in which case encryption is unnecessary; a hiding commitment is enough). However, this approach is vulnerable: attackers can carry out speculative MEV.

In speculative MEV, attackers guess that a certain encrypted transaction contains MEV opportunities, then encrypt their own transactions and attempt to insert them into a favorable position (e.g., before or after the target transaction). If the transactions are arranged in the expected order, the attacker will decrypt and extract MEV through their own transaction; if not, they will refuse to decrypt, and their transaction will not be included in the final blockchain.
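
The attack can be sketched with a hash-based commit-and-reveal scheme. This is an illustrative toy under stated assumptions: the `commit`, `execute_block`, and `attacker_reveals` helpers are hypothetical names, and the rule that unrevealed commitments are simply dropped is assumed from the description above.

```python
import hashlib
import secrets


def commit(tx: bytes, nonce: bytes) -> bytes:
    """Hiding commitment: a hash of a random nonce plus the transaction."""
    return hashlib.sha256(nonce + tx).digest()


def execute_block(ordered_commitments, openings):
    """Execute only transactions whose senders revealed a valid opening."""
    executed = []
    for c in ordered_commitments:
        opening = openings.get(c)
        if opening is not None and commit(*opening) == c:
            executed.append(opening[0])
    return executed


def attacker_reveals(order, attacker_c, victim_c) -> bool:
    """The attacker opens its commitment only if it landed before the victim."""
    return order.index(attacker_c) < order.index(victim_c)


victim_tx, victim_nonce = b"victim: buy TOKEN", secrets.token_bytes(16)
attack_tx, attack_nonce = b"attacker: buy TOKEN first", secrets.token_bytes(16)
victim_c = commit(victim_tx, victim_nonce)
attacker_c = commit(attack_tx, attack_nonce)


def outcome(order):
    openings = {victim_c: (victim_tx, victim_nonce)}  # honest user always reveals
    if attacker_reveals(order, attacker_c, victim_c):
        openings[attacker_c] = (attack_tx, attack_nonce)
    return execute_block(order, openings)


favorable = outcome([attacker_c, victim_c])    # attacker reveals and front-runs
unfavorable = outcome([victim_c, attacker_c])  # attacker withholds, loses nothing
```

In the unfavorable ordering the attacker's commitment simply never opens, so the attempt costs nothing beyond the broadcast itself.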

Perhaps penalties could be imposed on users who fail to decrypt, but implementing this mechanism is extremely challenging. The reason is that the penalty for all encrypted transactions must be uniform (after all, transactions cannot be distinguished once encrypted), and the penalty must be severe enough to deter speculative MEV even when facing high-value targets. This would lead to a large amount of funds being locked up, and these funds must remain anonymous (to avoid revealing the association between transactions and users). More troubling is that if genuine users are unable to decrypt due to software bugs or network failures, they would also suffer losses.
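
A back-of-the-envelope model shows why the bond becomes large. The model is an assumption made for illustration, not something specified in any proposal: the attacker gains `mev_value` with probability `p_favorable` (the ordering works out and they decrypt) and otherwise forfeits the bond by refusing to decrypt.

```python
def min_deterrent_bond(mev_value: float, p_favorable: float) -> float:
    """Smallest bond B such that p * V - (1 - p) * B <= 0 in this toy model."""
    if not 0.0 <= p_favorable < 1.0:
        raise ValueError("p_favorable must be in [0, 1)")
    return p_favorable * mev_value / (1.0 - p_favorable)


def uniform_bond(possible_mev_values, p_favorable: float) -> float:
    """Ciphertexts are indistinguishable, so one bond must cover the worst case."""
    return max(min_deterrent_bond(v, p_favorable) for v in possible_mev_values)


# Hypothetical MEV values; every sender, however small, posts the same bond.
bond = uniform_bond([50.0, 2_000.0, 90_000.0], p_favorable=0.25)
```

With these made-up numbers, a user whose transaction carries 50 units of extractable value must lock the same 30,000-unit bond as everyone else, which is the capital lock-up the paragraph describes.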

Therefore, most proposals require that an encrypted transaction is guaranteed to be decryptable at some point in the future, even if its sender goes offline or refuses to cooperate. This goal can be achieved in several ways:

Trusted Execution Environments (TEEs): Users can encrypt transactions to keys held securely by a trusted execution environment (TEE). In some basic versions, the TEE is only used to decrypt transactions after a specific point in time (which requires the TEE to be time-aware). More complex solutions make the TEE responsible for decrypting transactions and constructing blocks, ordering transactions by criteria such as arrival time and fees. Compared with other encrypted mempool solutions, the advantage of a TEE is that it can handle plaintext transactions directly, reducing redundant on-chain data by filtering out transactions that would revert. The drawback of this method is its reliance on hardware trustworthiness.

Secret-sharing and Threshold Encryption: In this solution, users encrypt transactions to a key that is jointly held by a specific committee (usually a subset of validators). Decryption requires meeting certain threshold conditions (e.g., two-thirds of the committee members must agree).
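
The threshold property can be illustrated with Shamir secret sharing, sketched below under simplifying assumptions. The function names are hypothetical, the field prime is chosen for convenience, and the key is reconstructed in one place; real threshold decryption schemes typically have members publish decryption shares so that the key itself is never reassembled.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime, used here as the field modulus


def split_secret(secret: int, n: int, t: int):
    """Split `secret` into n shares; any t of them recover it."""
    # Random polynomial of degree t - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


decryption_key = 123456789
shares = split_secret(decryption_key, n=6, t=4)  # two-thirds of six members
recovered = recover_secret(shares[:4])
```

Any four of the six shares recover the key, while three shares reveal nothing about it.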

With threshold decryption, the trusted entity shifts from hardware to the committee. Proponents argue that since most protocols already assume an honest majority of validators for consensus, we can make a similar assumption that a majority of validators will remain honest and will not decrypt transactions prematurely.

However, there is a key distinction: these two trust assumptions are not the same. Consensus failures, such as blockchain forks, are publicly visible (a "weak trust assumption"), while a malicious committee that decrypts transactions privately leaves no public evidence; such attacks can be neither detected nor punished (a "strong trust assumption"). Therefore, although the security assumptions of consensus and of decryption committees look the same on the surface, in practice the assumption that "the committee will not collude" is much less credible.

Time-lock and Delay Encryption: As an alternative to threshold encryption, the principle of delay encryption is that users encrypt transactions to a public key, while the private key corresponding to that public key is hidden within a time-lock puzzle. A time-lock puzzle is a cryptographic puzzle that encapsulates a secret, which can only be revealed after a predetermined time, specifically requiring a series of sequential computations that cannot be parallelized. In this mechanism, anyone can solve the puzzle to obtain the key and decrypt the transaction, but only after completing a lengthy computation designed to be slow (essentially serial execution), ensuring that the transaction cannot be decrypted before final confirmation. The strongest form of this encryption primitive is to publicly generate such puzzles through delay encryption techniques; it can also be approximated through trusted committees using time-lock encryption, although its relative advantages over threshold encryption are debatable.
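
A classic construction of a time-lock puzzle uses repeated squaring modulo a product of two primes; the toy sketch below follows that idea. Everything here is an assumption made for illustration: the primes are small, well-known Mersenne primes (insecure for real use), the secret is masked by simple modular addition, and the function names are hypothetical.

```python
P = 2**61 - 1  # toy primes only; a real puzzle needs large secret random primes
Q = 2**89 - 1
BASE = 2


def create_puzzle(secret: int, squarings: int):
    """The creator knows P and Q, so it can shortcut the exponentiation."""
    n = P * Q
    phi = (P - 1) * (Q - 1)
    exponent = pow(2, squarings, phi)  # reduce 2^t modulo phi(n)
    mask = pow(BASE, exponent, n)      # equals BASE^(2^t) mod n
    return n, (secret + mask) % n


def solve_puzzle(n: int, locked: int, squarings: int) -> int:
    """Without the factors, the solver must square sequentially t times."""
    value = BASE % n
    for _ in range(squarings):
        value = value * value % n  # each step depends on the previous one
    return (locked - value) % n


secret_key = 0xC0FFEE
n, locked = create_puzzle(secret_key, squarings=20_000)
recovered_key = solve_puzzle(n, locked, squarings=20_000)
```

The number of squarings sets the delay, which is why decryption timing depends on how fast the solver's hardware can run this loop.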

Whether using delay encryption or having a trusted committee perform computations, these solutions face many practical challenges: first, since the delay inherently relies on the computation process, it is difficult to ensure the precision of decryption timing; second, these solutions require specific entities to run high-performance hardware to efficiently solve the puzzles; although anyone can take on this role, how to incentivize that entity to participate remains unclear; finally, in such designs, all broadcast transactions will be decrypted, including those that were never ultimately written into blocks. In contrast, threshold (or witness encryption) solutions may only decrypt those transactions that are successfully included.

Witness Encryption: The final and most advanced cryptographic solution is "witness encryption." In theory, witness encryption encrypts information so that it can be decrypted only by someone who knows a "witness" for a specific NP relation. For example, information can be encrypted such that only someone who can solve a particular Sudoku puzzle or provide a specific hash preimage can decrypt it.

(Note: An NP relation is the correspondence between a "problem" and an "answer that can be quickly verified.")
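
To make the notion concrete, the hash-preimage example can be written as a relation that is cheap to check. This sketch shows only the relation, not witness encryption itself (for which no practical construction exists); the function name is hypothetical.

```python
import hashlib


def hash_preimage_relation(statement: bytes, witness: bytes) -> bool:
    """R(statement, witness) holds when SHA-256(witness) equals the statement."""
    return hashlib.sha256(witness).digest() == statement


# The statement is public; finding a witness is hard, checking one is fast.
statement = hashlib.sha256(b"correct horse battery staple").digest()
valid = hash_preimage_relation(statement, b"correct horse battery staple")
invalid = hash_preimage_relation(statement, b"wrong guess")
```

Under witness encryption, a ciphertext would be tied to `statement`, and only a party holding a witness that passes this check could decrypt.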

For any NP relation, SNARKs provide analogous functionality for proofs. Witness encryption can be thought of as encrypting data so that it can be decrypted only by someone who could produce a SNARK proving that a specific condition holds. In the context of encrypted mempools, a typical condition is that a transaction can be decrypted only after its block has been finalized.

This is a highly promising theoretical primitive. In fact, it is a generalization, with both committee-based and delay-based methods being special cases of it. Unfortunately, we currently have no practical witness encryption constructions. Moreover, even if such constructions existed, it is hard to say they would be better than committee-based methods on proof-of-stake chains. Even if witness encryption is set to "decrypt only once the transaction has been ordered in a finalized block," a malicious committee could still privately simulate the consensus protocol to fabricate finality for the transaction, then use this private chain as the "witness" to decrypt it. At that point, threshold decryption by the same committee would achieve equivalent security while being much simpler to operate.

In proof-of-work consensus protocols, however, the advantages of witness encryption are more pronounced, because even an entirely malicious committee cannot privately mine several new blocks on top of the current chain head to fabricate finality.

Technical Challenges Facing Encrypted Mempools

Several practical challenges limit the ability of encrypted mempools to prevent MEV. In general, keeping information confidential is hard. Although encryption is not yet widely used in Web3, decades of experience deploying it on the web (such as TLS/HTTPS) and in private communications (from PGP to modern encrypted messaging platforms like Signal and WhatsApp) have made the difficulties clear: encryption is a tool for protecting confidentiality, but it cannot provide absolute guarantees.

First, certain entities may directly access the plaintext information of user transactions. In typical scenarios, users usually do not encrypt transactions themselves but delegate this task to wallet service providers. As a result, wallet service providers can access the transaction plaintext and may even exploit or sell this information to extract MEV. The security of encryption always depends on all entities that can access the keys. The scope of key control defines the boundary of security.

In addition, the biggest problem lies in the metadata, which is the unencrypted data surrounding the encrypted payload (transaction). Searchers can use this metadata to infer transaction intentions and implement speculative MEV. It is important to note that searchers do not need to fully understand the transaction content or guess correctly every time. For example, as long as they can reasonably determine that a transaction is a buy order from a specific decentralized exchange (DEX), it is sufficient to initiate an attack.

Metadata falls into several categories: some are classic problems inherent to encryption, while others are unique to encrypted mempools.

· Transaction Size: Encryption itself cannot hide the size of the plaintext (notably, the formal definition of semantic security explicitly excludes hiding plaintext size). This is a common attack vector in encrypted communications; a typical case is that even after encryption, an eavesdropper can still determine what content is being played on Netflix in real-time by analyzing the size of each data packet in the video stream. In encrypted mempools, specific types of transactions may have unique sizes, thereby leaking information.

· Broadcast Time: Encryption also cannot hide timing information (another classic attack vector). In Web3 scenarios, certain senders (for example, those executing structured sales) may submit transactions at fixed intervals. Transaction times may also correlate with other information, such as activity on external exchanges or news events. A more subtle way to exploit timing is arbitrage between centralized exchanges (CEXs) and decentralized exchanges (DEXs): a sequencer can insert its own transaction, created as late as possible, to take advantage of the latest CEX price information, and at the same time exclude all other transactions (even encrypted ones) broadcast after a certain time, ensuring that only its transaction benefits from the latest price.

· Source IP Address: Searchers can infer the identity of transaction senders by monitoring peer-to-peer networks and tracing source IP addresses. This issue was identified in the early days of Bitcoin (over a decade ago). If a specific sender has a fixed behavioral pattern, it can be highly valuable to searchers. For example, knowing the sender's identity allows them to associate encrypted transactions with previously decrypted historical transactions.

· Transaction Sender and Fee/Gas Information: Transaction fees are a type of metadata unique to encrypted mempools. In Ethereum, traditional transactions include the on-chain sender address (used for paying fees), the maximum gas budget, and the unit gas fee the sender is willing to pay. Similar to the source network address, the sender address can be used to associate multiple transactions with real entities; the gas budget can imply transaction intent. For instance, interacting with a specific DEX may require a recognizable fixed amount of gas.

Sophisticated searchers may combine various types of metadata mentioned above to predict transaction content.

Theoretically, this information can be hidden, but it comes at the cost of performance and complexity. For example, padding transactions to a standard length can hide their size but wastes bandwidth and on-chain space; adding delays before sending can hide time but increases latency; submitting transactions through anonymous networks like Tor can hide IP addresses, but this brings new challenges.
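
The padding trade-off can be sketched as follows. The bucket sizes and the 4-byte length prefix are arbitrary choices made for this example, and the helper names are hypothetical; padding would be applied to the plaintext before encryption.

```python
BUCKETS = (256, 512, 1024, 2048)  # hypothetical standard sizes, in bytes


def pad_to_bucket(payload: bytes, buckets=BUCKETS) -> bytes:
    """Prefix the true length, then zero-pad up to the smallest fitting bucket."""
    body = len(payload).to_bytes(4, "big") + payload
    for size in buckets:
        if len(body) <= size:
            return body + bytes(size - len(body))
    raise ValueError("payload too large for any bucket")


def unpad(padded: bytes) -> bytes:
    """Recover the original payload using the length prefix."""
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


transfer = b"t" * 100  # stand-in for a small transfer
swap = b"s" * 200      # stand-in for a larger DEX swap
padded_transfer = pad_to_bucket(transfer)
padded_swap = pad_to_bucket(swap)
wasted_bytes = len(padded_transfer) - len(transfer)
```

Both payloads come out at 256 bytes, so size no longer distinguishes them, but the smaller one now carries 156 bytes of overhead.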

The hardest metadata to hide is transaction fee information. Encrypting fee data poses a series of problems for block builders. The first is spam: if fee data is encrypted, anyone can broadcast malformed encrypted transactions that get ordered but cannot pay fees, so they fail to execute after decryption and no one can be held accountable. This might be solvable with SNARKs proving that a transaction is well-formed and sufficiently funded, but doing so would significantly increase overhead.

The second problem is the efficiency of block building and fee auctions. Builders rely on fee information to create profit-maximizing blocks and to determine the current market price of on-chain resources. Encrypting fee data disrupts this process. One solution is to set a fixed fee for each block, but this is economically inefficient and may give rise to a secondary market for transaction inclusion, contradicting the design intent of encrypted mempools. Another solution is to conduct fee auctions through secure multi-party computation or trusted hardware, but both methods are extremely costly.

Finally, a secure encrypted mempool would increase system overhead in multiple ways: encryption would increase chain latency, computational load, and bandwidth consumption; how to integrate with important future goals like sharding or parallel execution remains unclear; it may also introduce new failure points for liveness (such as decryption committees in threshold schemes or delay function solvers); at the same time, design and implementation complexity would significantly rise.

Many of the issues facing encrypted mempools are similar to the challenges encountered by blockchains aimed at ensuring transaction privacy (such as Zcash and Monero). If there is any positive aspect, it is that solving all the challenges of cryptographic technology in MEV mitigation will also clear obstacles for transaction privacy.

Economic Challenges Facing Encrypted Mempools

Finally, encrypted mempools also face economic challenges. Unlike technical challenges, which can be gradually alleviated with sufficient engineering investment, these economic challenges represent fundamental limitations that are extremely difficult to resolve.

The core issue of MEV arises from the information asymmetry between those who create transactions (users) and those who extract MEV (searchers and block builders). Users are often unaware of how much extractable value their transactions contain, so even with a perfect encrypted mempool they may still be induced to leak decryption keys in exchange for a reward below the actual MEV value; this phenomenon can be termed "incentivized decryption."

This scenario is not hard to imagine, as similar mechanisms such as MEV-Share already exist. MEV-Share is an order flow auction that allows users to selectively submit transaction information to a pool, where searchers compete for the right to exploit the MEV opportunity in that transaction. After extracting the MEV, the winning bidder returns a portion of the profits (the bid amount or a certain percentage) to the user.

This model can be directly adapted to encrypted mempools: to participate, users would disclose their decryption keys (or partial information). However, most users do not understand the opportunity cost of participating in such mechanisms; they see only the immediate reward and are willing to leak information. Similar cases exist in traditional finance: for example, the zero-commission trading platform Robinhood profits by selling user order flow to third parties through "payment-for-order-flow."
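
The incentive can be illustrated with a toy first-price auction. This is not the actual mechanism of MEV-Share or any other product; the `Bid` type, the bid values, and the `true_mev` figure are hypothetical, and the key assumption is that searchers can estimate the extractable value while the user cannot.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    searcher: str
    rebate: float  # amount the searcher offers to pay back to the user


def run_order_flow_auction(bids, true_mev: float) -> dict:
    """Highest rebate wins; the winner keeps whatever it extracts above it."""
    winner = max(bids, key=lambda b: b.rebate)
    return {
        "winner": winner.searcher,
        "user_rebate": winner.rebate,
        "kept_by_searcher": true_mev - winner.rebate,
    }


# The user sees only the rebates on offer, never the true extractable value.
bids = [Bid("searcher_a", 12.0), Bid("searcher_b", 20.0), Bid("searcher_c", 5.0)]
result = run_order_flow_auction(bids, true_mev=100.0)
```

From the user's side, a rebate of 20 looks like free money; the 80 units retained by the winning searcher are invisible to them.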

Another possible scenario is that large builders force users to disclose transaction content (or related information) for censorship purposes. Censorship resistance is an important and controversial topic in Web3, but if large validators or builders are legally required (for example, by U.S. Office of Foreign Assets Control (OFAC) regulations) to enforce a sanctions list, they may refuse to process any encrypted transactions. Technically, users might use zero-knowledge proofs to show that their encrypted transactions comply with sanctions requirements, but this would add cost and complexity. Even if the blockchain has strong censorship resistance (guaranteeing that encrypted transactions are included), builders may still place known plaintext transactions at the front of the block and encrypted transactions at the end. Therefore, transactions that need execution priority may ultimately be forced to disclose their content to builders.

Other Efficiency Challenges

Encrypted mempools will increase system overhead in various obvious ways. Users need to encrypt transactions, and the system must decrypt them in some manner, which increases computational costs and may also increase transaction size. As mentioned earlier, handling metadata will further add to these overheads. However, there are also efficiency costs that are less apparent. In finance, a market is considered efficient if prices reflect all available information; delays and information asymmetry make markets inefficient, and encrypted mempools inevitably introduce both.

Such inefficiencies lead to a direct consequence: increased price uncertainty, which is a direct product of the additional delays introduced by encrypted mempools. Consequently, the number of transactions failing due to exceeding price slippage tolerance may increase, wasting on-chain space.
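
The slippage failure can be illustrated with a constant-product pool. The choice of a fee-less constant-product DEX, the reserve figures, and the 1% tolerance are all assumptions made for this example.

```python
def amount_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Output of a fee-less constant-product swap (x * y = k)."""
    return reserve_out * amount_in / (reserve_in + amount_in)


def try_swap(reserve_in, reserve_out, amount_in, min_out):
    """Return the output amount, or None if the swap would revert."""
    out = amount_out(reserve_in, reserve_out, amount_in)
    return out if out >= min_out else None


# Pool state when the user signs and encrypts the swap (hypothetical reserves).
eth, usdc = 1_000.0, 2_000_000.0
quoted = amount_out(eth, usdc, 10.0)
min_out = quoted * 0.99  # 1% slippage tolerance

# Executed immediately, the swap clears the tolerance.
immediate = try_swap(eth, usdc, 10.0, min_out)

# During the decryption delay, another trader sells 20 ETH into the same pool.
usdc_after = usdc - amount_out(eth, usdc, 20.0)
eth_after = eth + 20.0
delayed = try_swap(eth_after, usdc_after, 10.0, min_out)
```

Here `delayed` is `None`: the transaction still occupies block space but reverts because the price moved while it sat encrypted.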

Similarly, this price uncertainty may give rise to speculative MEV transactions that attempt to profit from on-chain arbitrage. Notably, encrypted mempools may make such opportunities more prevalent: because of execution delays, the current state of decentralized exchanges (DEXs) becomes less certain, likely leading to lower market efficiency and price discrepancies between trading venues. Such speculative MEV transactions also waste block space, as they often abort when no arbitrage opportunity is found.

Conclusion

This article has outlined the challenges facing encrypted mempools so that the community can also direct effort toward other solutions. Even so, encrypted mempools may still become part of the solution to MEV.

One feasible approach is a hybrid design: some transactions are ordered "blindly" through an encrypted mempool, while others use different ordering schemes. A hybrid design may suit specific types of transactions, such as buy and sell orders from large market participants who can carefully encrypt or pad their transactions and are willing to pay higher costs to avoid MEV. It also makes practical sense for highly sensitive transactions, such as fixes for vulnerable smart contracts.

However, due to technical limitations, high engineering complexity, and performance overhead, encrypted mempools are unlikely to become the "universal solution to MEV" that people hope for. The community needs to develop other solutions, including MEV auctions, application-layer defense mechanisms, and shorter finality times. MEV will remain a challenge for the foreseeable future, and in-depth research is needed to find a balance among the various solutions that mitigate its negative impacts.
