EIP-4444 can solve the historical growth problem of Ethereum and leave space for increasing the gas limit.
Authors: Storm Slivkoff, Georgios Konstantopoulos
Translation: Luffy, Foresight News
Historical growth is currently the biggest bottleneck for Ethereum scalability. Surprisingly, historical growth has become a bigger issue than state growth. Within a few years, historical data will exceed the storage capacity of many Ethereum nodes.
The good news is:
- Historical growth is an easier problem to solve than state growth.
- Solutions are actively being developed.
- Solving historical growth will alleviate the state growth problem.
In this article, we will continue to explore the Ethereum scalability issues from Part 1, shifting the focus from state growth to historical growth. Using detailed datasets, our goal is to 1) technically understand Ethereum's scalability bottleneck and 2) facilitate discussions on the optimal solution for Ethereum gas limits.
What is historical growth?
History is the collection of all blocks and transactions executed by Ethereum throughout its entire lifecycle, encompassing all data from the genesis block to the current block. Historical growth is the accumulation of new blocks and transactions over time.
Figure 1 shows the relationship between historical growth and various protocol metrics and Ethereum node hardware constraints. Compared to state growth, historical growth is subject to a different set of hardware constraints. Historical growth puts pressure on network IO, as new blocks and transactions must be transmitted across the entire network. It also puts pressure on node storage space, as each Ethereum node stores a complete copy of the historical records. If historical growth exceeds these hardware constraints, nodes will no longer be able to achieve stable consensus with their peer nodes. For an overview of state growth and other scalability bottlenecks, please refer to Part 1 of this series.

Figure 1: Ethereum scalability bottleneck
Until recently, most of each node's network throughput was used to transmit historical records (such as new blocks and transactions). With the introduction of blobs in the Dencun hard fork, this situation has changed. Blobs now account for a significant portion of node network activity. However, blobs are not considered part of historical records because 1) nodes store them for only 2 weeks and then discard them, and 2) they are not needed to replay the chain from Ethereum's genesis. Due to (1), blobs do not significantly increase the storage burden of each Ethereum node. We will discuss blobs later in this article.
In this article, we will focus on historical growth and discuss the relationship between history and state. Since state growth and historical growth share some overlapping hardware constraints, they are related issues, and solving one problem can help solve the other.
How fast is historical growth?
Figure 2 shows the historical growth rate of Ethereum since its genesis. Each vertical line represents a month of growth. The y-axis represents the monthly historical growth in gigabytes. Transactions are categorized by their "target address" and sized by their RLP-encoded bytes. Contracts that cannot be easily identified are classified as "unknown." The "other" category includes a range of small categories such as infrastructure and gaming.

Figure 2: Ethereum historical growth rate over time
Several key points from the above chart:
- Historical growth rate is 6 to 8 times faster than state growth: The historical growth rate recently peaked at 36.0 GiB/month and is currently at 19.3 GiB/month, while the state growth rate peaked at approximately 6.0 GiB/month and is currently at 2.5 GiB/month. We will discuss the comparison of historical and state growth in terms of growth and cumulative size later in this article.
- Before Dencun, the historical growth rate was accelerating: While state growth has been roughly linear over the years (as seen in Part 1), historical growth has been superlinear. Because a growth rate that rises linearly produces quadratic growth in total size, a growth rate that rises superlinearly produces faster-than-quadratic growth in total size. This acceleration abruptly stopped after Dencun. This is the first time Ethereum has experienced a significant decrease in its historical growth rate.
- Most recent historical growth comes mainly from rollups: Each L2 publishes a copy of its transactions back to the mainnet, which generated a large amount of historical records and made rollups the most significant contributor to historical growth over the past year. However, Dencun allows L2s to use blobs instead of historical records to publish their transaction data, so rollups no longer generate the majority of Ethereum's historical records. We will discuss rollups in more detail later in this article.
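The arithmetic behind the first two points can be checked with a short sketch. The growth rates are the figures quoted above. The linearly rising series is a hypothetical illustration, not measured data.

```python
# Growth-rate figures quoted in the text above, in GiB per month.
HISTORY_PEAK, HISTORY_NOW = 36.0, 19.3
STATE_PEAK, STATE_NOW = 6.0, 2.5


def ratio(history_rate: float, state_rate: float) -> float:
    """How many times faster history grows than state."""
    return history_rate / state_rate


def cumulative_size(monthly_rates: list[float]) -> float:
    """Total size is the sum of the monthly growth rates."""
    return sum(monthly_rates)


peak_ratio = ratio(HISTORY_PEAK, STATE_PEAK)    # 6.0
current_ratio = ratio(HISTORY_NOW, STATE_NOW)   # about 7.72

# Hypothetical series: a rate that rises by 1 GiB/month every month.
# Doubling the time span roughly quadruples the total (quadratic growth).
size_12 = cumulative_size([float(m) for m in range(1, 13)])  # 78.0
size_24 = cumulative_size([float(m) for m in range(1, 25)])  # 300.0

print(f"peak ratio: {peak_ratio:.1f}x, current ratio: {current_ratio:.2f}x")
print(f"12 months: {size_12} GiB, 24 months: {size_24} GiB")
```

The two ratios bracket the "6 to 8 times" range, and the toy series shows why a rising growth rate matters more for total size than a constant one.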
Who are the biggest contributors to Ethereum historical growth?
The amount of historical data generated by different contract categories reveals how Ethereum's usage patterns have evolved over time. Figure 3 shows the relative contributions of various contract categories. This data is normalized from the same data as Figure 2.

Figure 3: Contributions of different contract categories to historical growth
These data reveal four different periods of Ethereum usage patterns:
- Early period (purple): There was almost no on-chain activity in the initial years of Ethereum. Most of these early contracts are now difficult to identify and are labeled as "unknown" in the chart.
- ERC-20 era (green): The ERC-20 standard was finalized at the end of 2015, but significant development did not occur until 2017 and 2018. ERC-20 contracts became the largest source of historical growth in 2019.
- DEX/DeFi era (brown): DEX and DeFi contracts appeared on-chain as early as 2016 and gained attention in 2017. However, it was not until the DeFi summer of 2020 that they became the largest category of historical growth. DeFi and DEX contracts accounted for over 50% of historical growth in parts of 2021 and 2022.
- Rollup era (gray): In early 2023, L2 rollups began executing more transactions than the mainnet. In the months before Dencun, they generated about 2/3 of Ethereum's historical records.
Each era represents more complex Ethereum usage patterns than the previous one. Over time, this growing complexity can be seen as a form of Ethereum scaling that cannot be measured by simple metrics such as transactions per second.
In the most recent month of data (April 2024), rollups no longer generate the majority of historical records. It is currently unclear whether future historical records will come mainly from DEX and DeFi, or whether new usage patterns will emerge.
What about blobs?
The Dencun hard fork introduced blobs, significantly altering the dynamics of historical growth by allowing rollups to use inexpensive blobs instead of historical records to publish data. Figure 4 zooms in on the historical growth rate before and after the Dencun upgrade. The chart is similar to Figure 2, except each vertical line represents a day instead of a month.

Figure 4: Impact of Dencun on historical growth
From this chart, we can draw several key conclusions:
- Since Dencun, the historical growth from rollups has decreased by about 2/3: Most rollups have switched from calldata to blobs, significantly reducing the amount of historical records they generate. However, as of April 2024, some rollups have still not switched from calldata to blobs.
- Since Dencun, total historical growth has decreased by about 1/3: Dencun only reduced the historical growth from rollups. Historical growth in other contract categories has slightly increased. Even after Dencun, historical growth is still 8 times that of state growth (details in the next section).
Although blobs have reduced the historical growth rate, they are still a new feature of Ethereum. It is currently unclear at what level the historical growth rate will stabilize with the existence of blobs.
How fast is acceptable historical growth?
Increasing the gas limit will increase the historical growth rate. Therefore, proposals to increase the gas limit (such as Pump the Gas) must consider the relationship between historical growth and the hardware constraints of each node.
To determine an acceptable historical growth rate, it is necessary to understand how long current node hardware can keep up in terms of both networking and storage. Networking hardware may be able to sustain the status quo indefinitely, as the historical growth rate is unlikely to return to its pre-Dencun peak unless the gas limit is increased. However, the storage burden of history will continue to increase over time. Under the current storage strategy, every node's disk will inevitably fill up with historical records.
Figure 5 shows the storage burden of Ethereum nodes over time and predicts the growth of the storage burden over the next 3 years. The prediction is based on the growth rate in April 2024. This growth rate may increase or decrease with changes in future usage patterns or gas limits.

Figure 5: Size of historical records, state, and full node storage burden
From this chart, we can draw several key conclusions:
- The storage space occupied by historical records is approximately 3 times that of the state. This difference will continue to increase over time, as the historical growth rate is approximately 8 times that of the state.
- 1.8 TiB is the critical threshold at which many nodes will be forced to upgrade their storage disks. 2 TB is a common disk size, and it provides only 1.8 TiB of usable space. Note that TB (terabyte = 10^12 bytes) and TiB (tebibyte = 1024^4 bytes) are different units. For many node operators, the "real" critical threshold may be even lower, as post-Merge validators must run a consensus client alongside an execution client.
- The critical threshold will be reached within 2 to 3 years. Any increase in the gas limit will bring this date forward accordingly. Reaching this threshold will impose a significant maintenance burden on node operators and require the purchase of additional hardware (e.g., $300 NVMe drives).
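The unit conversion and the time-to-threshold estimate can be reproduced with a rough sketch. It uses the 1.2 TiB current full node size and the 19.3 and 2.5 GiB/month growth rates quoted in this article. The `gas_limit_multiplier` parameter is a simplifying assumption that growth scales linearly with the gas limit, and the result is a back-of-the-envelope check, not the projection behind Figure 5.

```python
BYTES_PER_TB = 10**12     # terabyte (decimal)
BYTES_PER_TIB = 1024**4   # tebibyte (binary)
GIB_PER_TIB = 1024


def tb_to_tib(terabytes: float) -> float:
    """Convert a disk's advertised size in TB to usable TiB."""
    return terabytes * BYTES_PER_TB / BYTES_PER_TIB


def months_until_full(current_gib: float, capacity_gib: float,
                      growth_gib_per_month: float,
                      gas_limit_multiplier: float = 1.0) -> float:
    """Months until the disk is full at a constant growth rate.

    Assumption: growth scales linearly with the gas limit.
    """
    remaining = capacity_gib - current_gib
    return remaining / (growth_gib_per_month * gas_limit_multiplier)


usable_tib = tb_to_tib(2.0)               # a 2 TB disk, about 1.82 TiB
current_gib = 1.2 * GIB_PER_TIB           # current full node size
capacity_gib = usable_tib * GIB_PER_TIB
growth = 19.3 + 2.5                       # history + state, GiB/month

months_today = months_until_full(current_gib, capacity_gib, growth)
months_doubled = months_until_full(current_gib, capacity_gib, growth,
                                   gas_limit_multiplier=2.0)

print(f"2 TB = {usable_tib:.2f} TiB")
print(f"months until full: {months_today:.1f}")
print(f"with a doubled gas limit: {months_doubled:.1f}")
```

Under these assumptions the disk fills in a little over two years, in line with the 2 to 3 year estimate, and doubling the gas limit halves that time.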
Unlike state data, historical data is much less frequently accessed and can theoretically be stored separately from state data on cheaper storage media. This can be achieved with some clients like Geth.
In addition to storage capacity, network IO is another major constraint on historical growth. Unlike storage capacity, network IO constraints will not pose an immediate problem for nodes, but they will become important for future increases in the gas limit.
To understand how much historical growth a typical Ethereum node's network capacity can support, it is necessary to know the relationship between historical growth and various network health metrics, such as reorg rate, slot misses, finality misses, attestation misses, sync committee misses, and block submission delays. The analysis of these metrics is beyond the scope of this article but can be found in previous investigations into the health of the consensus layer. Additionally, the Ethereum Foundation's Xatu project has been building public datasets to expedite such analyses.
How to solve the historical growth problem?
Historical growth is an easier problem to solve than state growth. It can be almost entirely addressed by the proposed EIP-4444. This EIP will change each node from storing the entire Ethereum historical data to only storing one year of historical data. After implementing EIP-4444, data storage will no longer be a bottleneck for Ethereum scalability, and gas limit increases will not be constrained in the long run. EIP-4444 is necessary for the long-term sustainability of the network; otherwise, the historical growth rate will quickly necessitate regular hardware updates for network nodes.
Figure 6 shows the impact of EIP-4444 on the storage burden of each node over the next 3 years. It is similar to Figure 5 but with lighter lines added to represent the storage burden after the implementation of EIP-4444.

Figure 6: Impact of EIP-4444 on the storage burden of Ethereum nodes
From this chart, several key conclusions can be drawn:
- EIP-4444 will halve the current storage burden. The storage burden will decrease from 1.2 TiB to 633 GiB.
- EIP-4444 will stabilize the historical storage burden. Assuming a constant historical growth rate, historical data will be discarded at the rate it is generated.
- After EIP-4444, the storage burden of nodes will take many years to reach today's level. This is because state growth will be the only factor increasing the storage burden, and the growth rate of the state is slower than that of historical growth.
After implementing EIP-4444, historical growth will still impose a certain level of storage burden, as nodes will store one year of historical records. However, this burden will not be difficult to manage even as Ethereum scales globally. Once the method of storing historical records is proven to be reliable, the one-year expiration time of EIP-4444 may be shortened to a few months, weeks, or even shorter.
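The stabilizing effect of a fixed retention window can be illustrated with a minimal simulation. It assumes a constant growth rate of 19.3 GiB/month (the April 2024 figure quoted earlier) and starts from an empty disk, so the absolute numbers are illustrative and do not reproduce Figure 6. The function name and parameters are hypothetical.

```python
from collections import deque


def simulate_history_storage(monthly_growth_gib, retention_months=None):
    """Return the stored history size (GiB) after each month.

    retention_months=None keeps everything (today's behaviour).
    An integer drops data older than that many months, which mimics
    EIP-4444-style expiry.
    """
    window = deque()
    stored = 0.0
    sizes = []
    for growth in monthly_growth_gib:
        window.append(growth)
        stored += growth
        if retention_months is not None and len(window) > retention_months:
            stored -= window.popleft()  # the oldest month expires
        sizes.append(stored)
    return sizes


rates = [19.3] * 36  # three years at a constant growth rate
keep_all = simulate_history_storage(rates)
expire = simulate_history_storage(rates, retention_months=12)

print(f"after 3 years, keeping everything: {keep_all[-1]:.1f} GiB")
print(f"after 3 years, one-year expiry:    {expire[-1]:.1f} GiB")
```

Without expiry the stored history keeps growing. With a one-year window it levels off at twelve months of growth once the first year has passed.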
How to store Ethereum's historical records?
EIP-4444 raises a question: If historical records are not stored by Ethereum nodes themselves, how should they be stored? Historical records play a crucial role in the verification, accounting, and analysis of Ethereum, so preserving them is essential. Fortunately, preserving historical records is a simple problem that only requires 1 of n data providers to be honest. This is in stark contrast to the state consensus problem, which requires 1/3 to 2/3 of participants to be honest. Node operators can verify the authenticity of a historical data set by 1) replaying all transactions since the genesis block and 2) checking that these transactions reproduce the same state root as the current chain tip.
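The replay-and-compare check described above can be sketched with a toy model. This is a deliberately simplified illustration: the "state" is a dictionary of balances, transactions are plain transfers, and the "state root" is a SHA-256 hash of the serialized state, whereas Ethereum executes transactions in the EVM and commits to state with a Merkle Patricia trie. All names here are hypothetical.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Toy commitment: SHA-256 over canonical JSON (not a real trie root)."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def replay(genesis: dict, transactions: list) -> dict:
    """Apply (sender, recipient, amount) transfers in order from genesis."""
    state = dict(genesis)
    for sender, recipient, amount in transactions:
        if state.get(sender, 0) < amount:
            raise ValueError("invalid transaction in history")
        state[sender] = state.get(sender, 0) - amount
        state[recipient] = state.get(recipient, 0) + amount
    return state


def history_is_authentic(genesis, transactions, trusted_root) -> bool:
    """True if replaying the history reproduces the trusted state root."""
    try:
        return state_root(replay(genesis, transactions)) == trusted_root
    except ValueError:
        return False


genesis = {"alice": 100, "bob": 0}
honest = [("alice", "bob", 30), ("bob", "alice", 5)]
trusted_root = state_root(replay(genesis, honest))  # known from the chain tip

tampered = [("alice", "bob", 30), ("bob", "alice", 6)]
print(history_is_authentic(genesis, honest, trusted_root))
print(history_is_authentic(genesis, tampered, trusted_root))
```

Because any altered history fails to reproduce the trusted root, a node only needs to find one provider serving the correct data. It does not need to trust any of them.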
There are many methods for preserving historical records.
Torrents/P2P: Torrents are the simplest and most reliable method. Ethereum nodes can periodically package partial historical records and share them as public Torrent files. For example, a node might create a new historical Torrent file every 100,000 blocks. Node clients like Erigon have already implemented this process to some extent in a non-standardized manner. To standardize this process, all node clients must use the same data format, parameters, and P2P network. Nodes will be able to choose whether to participate in this network based on their storage and bandwidth capabilities. The advantage of Torrents is that they are a long-established (highly "Lindy") open standard supported by a large ecosystem of data tools.
Portal Network: Portal Network is a new network designed specifically for hosting Ethereum data. It is a method similar to Torrents, while also providing additional features to make data verification easier. The advantage of Portal Network is that these additional verification layers provide utilities for light clients to effectively verify and query shared datasets.
Cloud Hosting: Cloud storage services such as AWS's S3 or Cloudflare's R2 provide a cheap and high-performance option for storing historical records. However, this method brings more legal and operational risks, as it cannot be guaranteed that these cloud services will always be willing and able to host cryptocurrency data.
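Whichever hosting method is used, history has to be split into fixed, predictable chunks so that every client produces identical archives. A minimal sketch of the "one archive every 100,000 blocks" example from the Torrents option follows. The function names and the file naming scheme are hypothetical and do not describe any existing client's format.

```python
def archive_ranges(latest_block: int, blocks_per_archive: int = 100_000):
    """List (first_block, last_block) for every complete archive.

    Blocks are numbered from 0 (genesis). A trailing partial range is
    left out until enough blocks exist to fill it.
    """
    ranges = []
    start = 0
    while start + blocks_per_archive - 1 <= latest_block:
        ranges.append((start, start + blocks_per_archive - 1))
        start += blocks_per_archive
    return ranges


def archive_name(first_block: int, last_block: int) -> str:
    """Hypothetical, zero-padded file name so archives sort by block."""
    return f"history-{first_block:09d}-{last_block:09d}"


for first, last in archive_ranges(latest_block=250_000):
    print(archive_name(first, last))
```

Deterministic boundaries mean two independent nodes package the same blocks into the same archive, which is a precondition for sharing and verifying them across a common network.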
The remaining implementation challenges are more social challenges than technical challenges. The Ethereum community needs to coordinate specific implementation details to integrate them directly into each node client. In particular, executing a full sync from the genesis block (rather than a snapshot sync) will require retrieving historical records from historical record providers rather than Ethereum nodes. These changes do not require a hard fork from a technical standpoint, so they can be implemented earlier than the next Ethereum hard fork, Pectra.
All of these historical preservation methods can also be used by L2 to store their blob data released to the mainnet. Compared to historical preservation, blob preservation is 1) more difficult due to the much larger total data volume, and 2) less critical because blobs are not necessary for replaying mainnet history. However, blob preservation is still necessary for each L2 to replay its own history. Therefore, some form of blob preservation is important for the entire Ethereum ecosystem. Additionally, if L2 develops robust blob storage infrastructure, they may also be able to easily store L1 historical data.
Directly comparing the datasets stored by different types of nodes before and after EIP-4444 would be helpful. Figure 7 shows the storage burden of different types of Ethereum nodes. State data refers to accounts and contracts, historical data refers to blocks and transactions, and archive data is a set of optional data indices. The byte counts in this table are based on the most recent reth snapshot, but the numbers for other node clients should be roughly equivalent.

Figure 7: Storage burden of different types of Ethereum nodes
In other words,
Archive nodes store state data, historical data, and archive data. They can be used when someone wants to easily query historical chain states.
Full nodes store only historical data and state data. Most nodes today are full nodes. The storage burden of full nodes is approximately half that of archive nodes.
After EIP-4444, full nodes will only store state data and the most recent year of historical data. This will reduce the storage burden of nodes from 1.2 TiB to 633 GiB and stabilize the storage space for historical data.
Stateless nodes, also known as "light nodes," do not store any datasets and can immediately verify blocks at the tip of the chain. This node type may become possible once Verkle tries or other state commitment schemes are added to Ethereum.
Finally, there are additional EIPs that can limit the historical growth rate, not just adapt to the current growth rate. This will help to stay within network IO constraints in the short term and within storage constraints in the long term. Although EIP-4444 is still necessary for the long-term sustainability of the network, these other EIPs will help Ethereum scale more effectively in the future:
EIP-7623: Repricing calldata to make certain transactions with excessive calldata more expensive. Making these usage patterns more expensive will push some of them to switch from calldata to blobs, reducing the historical growth rate.
EIP-4488: Imposing limits on the total amount of call data that can be included in each block. This will impose stricter limits on the growth rate of historical records.
These EIPs are easier to implement than EIP-4444, so they may serve as short-term measures before EIP-4444 is put into production.
Conclusion
The purpose of this article is to use data to understand 1) how historical growth works and 2) how it can be addressed. Much of the data in this article is difficult to obtain through traditional means, so we hope it provides some new insights into the historical growth problem.
Historical growth as a bottleneck for Ethereum scalability has not received enough attention. Even without increasing the gas limit, the current practice of saving historical records in Ethereum will force many nodes to upgrade their hardware within a few years. Fortunately, this is not an insurmountable problem. A clear solution is already present in EIP-4444. We believe that the implementation of this EIP should be expedited to make room for future gas limit increases.