What is data availability? What data availability issues does L2 face? Why is there so much controversy over the data availability layer L2? This article will focus on these questions and attempt to unveil the mystery of data availability.
Author: Jian Shu
Ethereum Foundation researcher Dankrad Feist once tweeted that an L2 is not really an L2 unless it uses Ethereum for data availability. By that standard, many chains, such as Arbitrum Nova, Polygon, and Mantle, would be excluded from the L2 category.
So what exactly is data availability, what data availability issues does L2 face, and why is L2's data availability layer so contested? The sections below take up each question in turn.
What is data availability
Simply put, data availability refers to the block producer publishing all transaction data of the block to the network so that validators can download it.
If a block producer publishes complete data and allows validators to download it, we say the data is available; if it withholds some data so that validators cannot download the complete data, we say the data is unavailable.
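The definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the block header carries a single hash commitment over all transactions (a stand-in for a real Merkle root) and that a validator marks anything it could not download as `None`. The function names are hypothetical and do not come from any client implementation.

```python
import hashlib
from typing import Optional


def commitment(transactions: list[bytes]) -> str:
    """Hash all transactions into one digest (a stand-in for a Merkle root)."""
    h = hashlib.sha256()
    for tx in transactions:
        h.update(hashlib.sha256(tx).digest())
    return h.hexdigest()


def is_available(header_commitment: str, downloaded: list[Optional[bytes]]) -> bool:
    """True only if every transaction was downloaded and matches the header."""
    complete = [tx for tx in downloaded if tx is not None]
    if len(complete) != len(downloaded):
        return False  # the producer withheld at least one transaction
    return commitment(complete) == header_commitment


# Hypothetical block with two transactions.
txs = [b"alice->bob:5", b"bob->carol:2"]
header = commitment(txs)

full = is_available(header, list(txs))          # producer published everything
withheld = is_available(header, [txs[0], None])  # producer withheld one transaction
```

In this toy model the data is "available" only in the first case; in the second, the validator cannot reconstruct the block even though the header itself was published.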
Difference between data availability and data retrievability
We often confuse data availability with data retrievability, but they are quite different.
Data availability concerns the stage when a block has been produced but not yet added to the blockchain through consensus. It is therefore not about historical data, but about whether newly published data can be downloaded and checked before the block is accepted by consensus.
Data retrievability concerns the stage after data has been permanently stored in the blockchain through consensus, i.e., the ability to retrieve historical data. In Ethereum, nodes that keep the complete history, including all historical states, are called archive nodes.
For this reason, a co-founder of L2BEAT once pointed out in a long tweet that full nodes are under no obligation to serve historical data; we can obtain it only because full nodes are kind enough to provide it.
He also stated that the term "data availability" may lead to misunderstandings about its function and should be replaced with "data publishing," a view that was also endorsed by the founder of Celestia.
Data availability issues in L2
Although the concept of data availability originated in Ethereum, attention has now shifted to data availability at the L2 level.
In L2, the sequencer is the block producer, and it needs to publish enough transaction data for validators to check the validity of transactions.
This process raises two issues: ensuring the security of the verification mechanism and reducing the cost of data publishing. Each is explained below.
Issue of ensuring the security of the verification mechanism
We know that OP Rollup uses fraud proofs to verify the validity of transactions, while ZK Rollup uses validity proofs.
- For OP Rollup: If the sequencer does not publish the complete data needed to reconstruct the block, challengers cannot re-execute the transactions and therefore cannot mount a valid fraud-proof challenge.
- For ZK Rollup: Although a validity proof can be verified without the underlying data, the ZK Rollup as a whole still needs data availability. Without the data needed to reconstruct the state, users cannot know their balances and may lose access to their assets.
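The OP Rollup case can be sketched as a toy model. The following assumes a simplified account-balance state, a state root computed as a plain hash of the balances, and a challenger that re-executes the published transactions. All names (`state_root`, `execute`, `challenge`) are hypothetical and only illustrate why a challenger is stuck when data is withheld; real fraud-proof systems are far more involved.

```python
import hashlib
import json
from typing import Optional

Tx = tuple[str, str, int]  # (sender, receiver, amount)


def state_root(balances: dict[str, int]) -> str:
    """Toy state root: hash of the balances in a canonical encoding."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute(balances: dict[str, int], txs: list[Tx]) -> dict[str, int]:
    """Apply transfers in order, skipping any the sender cannot afford."""
    new = dict(balances)
    for sender, receiver, amount in txs:
        if new.get(sender, 0) < amount:
            continue
        new[sender] -= amount
        new[receiver] = new.get(receiver, 0) + amount
    return new


def challenge(prev: dict[str, int], published: Optional[list[Tx]], claimed_root: str) -> str:
    """Re-execute the batch and compare with the sequencer's claimed root."""
    if published is None:
        return "cannot-challenge"  # data withheld: nothing to re-execute
    actual = state_root(execute(prev, published))
    return "valid" if actual == claimed_root else "fraud-detected"


prev = {"alice": 10, "bob": 0}
txs = [("alice", "bob", 4)]
honest_root = state_root(execute(prev, txs))
bogus_root = state_root({"alice": 0, "bob": 10})  # sequencer lies about the result

honest_result = challenge(prev, txs, honest_root)
fraud_result = challenge(prev, txs, bogus_root)
withheld_result = challenge(prev, None, bogus_root)
```

With the data published, the bogus root is caught; with the data withheld, the same bogus root goes unchallenged, which is the security issue described above.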
To ensure secure verification, current L2 sequencers generally publish L2's state data and transaction data on the more secure Ethereum, relying on Ethereum for settlement and obtaining data availability.
Therefore, the data availability layer is actually where L2 publishes transaction data, and mainstream L2s currently use Ethereum as the data availability layer.
Issue of reducing the cost of data publishing
Current L2s handle both data availability and settlement on Ethereum. This provides sufficient security but also incurs significant costs, which leads to the second issue L2 faces: how to reduce the cost of data publishing.
The total gas paid by users to L2 mainly consists of the gas for executing transactions and the gas for L2 to submit data to L1. The former cost is minimal, while the latter is the major part of user expenses. Among these, the transaction data published to ensure data availability accounts for the main part of the data submitted from L2 to L1, while proof data for verifying transaction validity only accounts for a small part.
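To see why published data dominates the cost, here is a rough worked calculation. It assumes the batch is posted as ordinary transaction calldata priced under EIP-2028 (4 gas per zero byte, 16 gas per non-zero byte) and ignores the base transaction cost, compression, and any later pricing changes. The batch contents and the gas price are made-up illustrative values.

```python
ZERO_BYTE_GAS = 4       # calldata gas per zero byte (EIP-2028)
NONZERO_BYTE_GAS = 16   # calldata gas per non-zero byte (EIP-2028)


def calldata_gas(data: bytes) -> int:
    """Gas charged for the calldata bytes alone."""
    zeros = data.count(0)
    return zeros * ZERO_BYTE_GAS + (len(data) - zeros) * NONZERO_BYTE_GAS


def l1_data_fee_eth(data: bytes, gas_price_gwei: float) -> float:
    """Fee in ETH for publishing `data` at the given gas price (1 gwei = 1e-9 ETH)."""
    return calldata_gas(data) * gas_price_gwei * 1e-9


# Hypothetical 100-byte batch: 20 zero bytes and 80 non-zero bytes.
batch = bytes([0] * 20 + [1] * 80)
gas = calldata_gas(batch)             # 20 * 4 + 80 * 16 = 1360
fee = l1_data_fee_eth(batch, 30.0)    # at an assumed gas price of 30 gwei
```

Because the cost grows linearly with the number of bytes published, shrinking or relocating the published data is the main lever for cheaper L2 fees.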
Therefore, to make L2 overall cheaper, the cost of data publishing must be reduced. So, how can costs be reduced? There are mainly two methods:
- Reduce the cost of publishing data on L1, such as the upcoming EIP-4844 upgrade for Ethereum.
- Move data availability off L1, just as Rollups moved transaction execution off L1, i.e., stop using Ethereum as the data availability layer.
Controversy over L2's data availability layer
To discuss the controversy over L2's data availability layer, we need to start with modular blockchains. A modular blockchain decouples the core functions of a blockchain into relatively independent parts and improves on the performance of a single chain by combining specialized networks.
Although there is still some controversy over the layering of modular blockchain, the widely accepted division of modular blockchain is into four layers: execution, settlement, consensus, and data availability. The functions of each module are as shown in the following figure.
Modular blockchains resemble LEGO bricks: builders can pick the best brick for each function and assemble a custom design, which mitigates the blockchain "impossible triangle" (the trade-off among decentralization, security, and scalability).
However, apart from separating the execution layer from Ethereum, current L2s still rely on Ethereum for the other three layers. Because of cost considerations, many L2s are preparing to move the data availability layer off Ethereum and use Ethereum only for settlement and consensus.
Interestingly, Ethereum seems reluctant to let L2s obtain data availability from elsewhere. Ethereum Foundation researcher Dankrad Feist once tweeted that a chain that does not use Ethereum as its data availability layer is not a Rollup, and therefore not an L2.
At the same time, in L2BEAT's latest definition of L2, it is pointed out that scaling solutions that do not publish data on L1 are not L2, because using off-chain data availability solutions cannot guarantee that operators will provide the published data.
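The rule described above can be written as a small classifier. This is only a sketch of the stated criterion (data published on L1 or not). The labels "Validium" and "Optimium" are the names commonly used for off-chain-data designs with validity proofs and fraud proofs respectively; they are added here for illustration, and the example chains are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingSolution:
    name: str
    proof_system: str          # "fraud" or "validity"
    publishes_data_on_l1: bool


def classify(s: ScalingSolution) -> str:
    """Apply the 'data must be published on L1' rule for L2 status."""
    if s.publishes_data_on_l1:
        return "L2 (Rollup)"
    if s.proof_system == "validity":
        return "not L2 (Validium)"
    return "not L2 (Optimium)"


# Hypothetical chains, not real projects.
chain_a = ScalingSolution("chain-a", "fraud", publishes_data_on_l1=True)
chain_b = ScalingSolution("chain-b", "validity", publishes_data_on_l1=False)
chain_c = ScalingSolution("chain-c", "fraud", publishes_data_on_l1=False)

labels = [classify(c) for c in (chain_a, chain_b, chain_c)]
```

Under this rule the proof system does not matter for L2 status; only where the data is published does.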
Of course, there is no definitive conclusion on what counts as an L2 at the moment. Ethereum Foundation members and L2BEAT insist that an L2 should keep its data availability layer on Ethereum, seemingly out of security considerations. But could there also be a concern that Ethereum's position would be shaken?
Ethereum's vision is to become a supercomputer platform. To improve network performance, it later had to develop Rollups, and many ecosystem projects moved to cheaper L2s. Because security was still provided by Ethereum, this had little impact on Ethereum's position. But if L2s move the data availability layer off Ethereum, they fundamentally weaken their reliance on Ethereum's security and gradually drift away from Ethereum, which threatens its position.
However, this does not prevent the flourishing development of projects related to the data availability layer. In the next article about data availability, the author will detail the main data availability solutions and specific related projects currently available on the market. Stay tuned.