Author: PermaDAO
Decentralized storage is a method of storing data that does not rely on a single central point of control. It contrasts with traditional centralized storage (for example, cloud storage services such as Amazon S3 or Google Cloud), which is typically managed by a single enterprise or organization.
Mainstream Decentralized Storage
Currently, the mainstream decentralized storage options include Arweave, Filecoin, and Storj. Each of them has unique characteristics and design concepts:
- Arweave focuses on long-term or permanent data storage.
- Filecoin provides a decentralized market similar to traditional cloud storage, supporting flexible storage needs.
- Storj emphasizes providing secure and privacy-protected decentralized cloud storage services.
These three platforms all use blockchain technology, but their application scenarios, technical implementations, and payment models differ, making them suitable for different types of storage needs:
Arweave
- Objective: To provide a long-term, permanent data storage solution. Arweave aims to store data indefinitely, primarily for long-term data preservation.
- Technology: It uses a blockchain-like data structure called "Blockweave." Unlike a traditional blockchain, each new block in Blockweave also references a randomly selected earlier block, a design intended to encourage miners to preserve old data over the long term.
- Payment Model: Users pay a one-time fee for data storage, theoretically allowing permanent access to the data once stored.
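The recall idea described for Blockweave can be illustrated with a toy model. The sketch below is not Arweave's actual consensus code: the function names (`recall_index`, `make_block`), the block fields, and the hashing scheme are all assumptions made for illustration. It only shows how a new block can be tied to a pseudo-randomly chosen earlier block, so that a miner who has kept more history can produce the next block more often.

```python
import hashlib

def recall_index(prev_hash: str, height: int) -> int:
    """Pick an earlier block index deterministically from the previous block hash."""
    if height < 1:
        raise ValueError("need at least one earlier block")
    return int(prev_hash, 16) % height

def make_block(chain: list[dict], payload: str) -> dict:
    """Append a block that commits to its parent and to one recall block."""
    prev = chain[-1]
    height = len(chain)
    recall = recall_index(prev["hash"], height)
    # A miner must still hold chain[recall] to build this block (toy assumption).
    body = f'{prev["hash"]}|{chain[recall]["hash"]}|{payload}'
    block = {
        "height": height,
        "recall": recall,
        "payload": payload,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
    }
    chain.append(block)
    return block

genesis = {"height": 0, "recall": None, "payload": "genesis",
           "hash": hashlib.sha256(b"genesis").hexdigest()}
chain = [genesis]
for i in range(5):
    make_block(chain, f"data-{i}")
```

Because the recall index depends on the previous block hash, a miner cannot know in advance which old block will be needed, which is the incentive to keep as many of them as possible.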
Filecoin
- Objective: To create a decentralized storage market similar to traditional cloud storage services.
- Technology: Filecoin is the incentive layer for IPFS (InterPlanetary File System). It uses "proofs of replication" and "proofs of spacetime" to verify that data is stored correctly.
- Payment Model: Users pay providers based on the amount of stored data and time. This is a more traditional leasing model, allowing users to increase or decrease storage as needed and pay accordingly.
Storj
- Objective: To provide users with a decentralized cloud storage solution, focusing on security and privacy protection.
- Technology: Storj uses encryption and sharding to protect data security and privacy. Data is encrypted and segmented into multiple small blocks before being uploaded, then distributed and stored on nodes globally.
- Payment Model: Storj's payment model is similar to traditional cloud storage, based on the used storage space and bandwidth.
In comparison, Arweave stands out for its emphasis on permanent storage, focusing on censorship resistance and data persistence. Filecoin and Storj both operate storage markets, with the emphasis on rebuilding the storage market using blockchain technology.
Business Architecture Analysis
Arweave's theoretical basis for permanent data storage is an empirical trend similar to "Moore's Law." Based on data storage costs from 1980 to the present, the cost of storage falls by about 20% per year. If this trend holds, the cumulative cost of storing data over an unlimited number of years converges to a finite constant. Arweave's permanence is built on this: when users store data, they are charged the calculated cost of storing it for 200 years.
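The convergence argument is a geometric series, and a short calculation makes it concrete. The sketch below assumes the 20% annual decline quoted above and a first-year cost normalized to 1; the function names are illustrative and this is not Arweave's actual pricing formula.

```python
def cumulative_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """Total cost over `years` when the yearly cost falls by a fixed rate each year."""
    ratio = 1.0 - annual_decline
    return first_year_cost * (1.0 - ratio ** years) / (1.0 - ratio)

def perpetual_cost(first_year_cost: float, annual_decline: float) -> float:
    """Limit of the same geometric series as the number of years goes to infinity."""
    return first_year_cost / annual_decline

cost_200 = cumulative_cost(1.0, 0.20, 200)
cost_inf = perpetual_cost(1.0, 0.20)
print(f"200 years: {cost_200:.6f}  forever: {cost_inf:.6f}")
```

Under these assumptions both totals are about five times the first-year cost, which is why charging for 200 years is practically the same as charging for storage forever.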
At the same time, Arweave has designed a simple and elegant mining mechanism, which can be called "effective data mining."
"Effective data" refers to data that has already been stored in the Arweave network and for which users have already paid 200 years of storage cost. Another group in the network, the miners, use this effective data for mining and provide read services for it. Unlike other storage blockchains, Arweave does not force miners to store data; instead, it sets incentive rules that encourage each miner to store as much "effective data" as possible. In the Arweave network, the more "effective data" a miner stores, the greater their mining "power."
Assuming there is 100 TB of effective data in the Arweave network, it is not mandatory for a miner to store all 100 TB of data. In other words, a miner can mine by storing only 100 MB of data, but their mining power will be very small. If a miner chooses to store all 100 TB of data, their mining power will reach its maximum value.
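To get a feel for the gap between the two miners in this example, the sketch below compares the fraction of the network's effective data each one holds. Treating mining power as proportional to that fraction is an assumption made here for illustration; the text only states that power grows with the amount stored. Decimal units (1 TB = 10^12 bytes) are also assumed.

```python
MB = 10**6   # decimal megabyte (assumption)
TB = 10**12  # decimal terabyte (assumption)

def stored_fraction(stored_bytes: int, network_bytes: int) -> float:
    """Share of the network's effective data held by one miner, capped at 100%."""
    return min(stored_bytes, network_bytes) / network_bytes

network = 100 * TB
small_miner = stored_fraction(100 * MB, network)  # stores 100 MB
full_miner = stored_fraction(100 * TB, network)   # stores everything

print(f"small miner holds {small_miner:.0e} of the data, full miner holds {full_miner:.0%}")
```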
In the "effective data mining" mechanism, the Arweave network incentivizes miners to store as much data as possible but does not force them to store all data. With this incentive model, is there a possibility of data loss? The following is a simulation of data loss:
In the simulation, the 0.5 in the first and second rows means that a single node stores 50% of the data. Assuming the network has 200,000 blocks and 200 nodes, with each node randomly storing 100,000 blocks (50% of the block data), the probability that a given block is inaccessible is 0.5^200, or about 6.223 × 10^-61. By comparison, the data reliability offered by cloud services is 99.9999999%, which corresponds to a loss probability on the order of 10^-9. The Arweave simulation reaches an astonishing 10^-61.
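The figure quoted in the simulation can be reproduced in a few lines. The sketch assumes each node chooses its blocks independently and uniformly at random, so a given block is missing from one node with probability 0.5 and missing from all 200 nodes with probability 0.5^200; the function name is illustrative.

```python
def block_loss_probability(node_count: int, stored_fraction: float) -> float:
    """Probability that no node holds a given block, assuming independent random storage."""
    return (1.0 - stored_fraction) ** node_count

p_block = block_loss_probability(node_count=200, stored_fraction=0.5)
expected_lost = 200_000 * p_block  # expected number of missing blocks in the network

print(f"P(block missing)        = {p_block:.3e}")
print(f"expected missing blocks = {expected_lost:.3e}")
```

Even summed over all 200,000 blocks, the expected number of lost blocks stays far below one under these assumptions.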
Both Filecoin and Storj have created decentralized storage markets using blockchain technology. Their models are similar to traditional order books in trading markets, where users place bids and specify data storage time and redundancy quantity, and miners accept profitable orders. To ensure fairness in the entire trading market, Filecoin has established a complex economic model, setting rules for penalties and small installment payments, among others. Its core technologies are replication proof and proof of spacetime.
Replication proof: Miners prove to users that data has been stored on dedicated physical devices. Each time a miner proves storage to a user, the network pays the miner.
Proof of spacetime: Replication proof alone cannot guarantee that data is stored continuously, because a miner could hold the data only at the moments when it submits a proof. Filecoin therefore adds proof of spacetime to ensure that miners store the data continuously.
In summary, the basis and implementation of Arweave's permanence are:
- The cost of permanence decreases year by year.
- Incentivizing miners through "effective data mining" to achieve permanent data storage.
Filecoin and Storj use blockchain technology to create decentralized storage markets. Their models are similar to traditional trading market order books, where bidders provide demands and miners accept orders to ensure data storage. The core technical points of Filecoin are replication proof and proof of spacetime.
Storage Operation
There are two ways to store data on Arweave. The first is to send data directly to an Arweave node and pay in AR. The second is to use the ANS-104 bundled data protocol to batch data onto Arweave.
Directly Storing Data on Arweave
Users only need a wallet holding AR to complete this step. With the arweave-js library, a file such as file.pdf can then be stored on Arweave.
For more documentation, refer to: https://github.com/ArweaveTeam/arweave-js.
Using ANS-104 to Store Data on Arweave (Recommended)
Arweave produces blocks relatively slowly, typically about one every 2 minutes, and a block can hold at most 1000 transactions, which greatly limits the number of transactions that can be stored on Arweave. Although the protocol places no limit on the data size of a single transaction, and in practice users can directly store 100 MB to 10 GB of data in one transaction, the number of transactions remains the bottleneck. ANS-104 was introduced to address this transaction capacity problem.
ANS-104 is a transaction bundling technology that can bundle tens of thousands of separate data items into a single regular Arweave transaction. It can be compared to Ethereum's Layer 2 rollups, with the difference that ANS-104 does not compromise data security: the bundled data is stored on Arweave in full.
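A back-of-the-envelope calculation shows why bundling matters. The sketch uses the 2-minute block time and the 1000-transaction block limit stated above; the bundle size of 10,000 items is an assumption standing in for "tens of thousands," and the result is an upper bound rather than measured throughput.

```python
BLOCK_INTERVAL_MIN = 2     # approximate block time from the text
TX_PER_BLOCK = 1000        # per-block transaction limit from the text
ITEMS_PER_BUNDLE = 10_000  # assumption: one bundle carries ten thousand data items

blocks_per_day = 24 * 60 // BLOCK_INTERVAL_MIN
base_tx_per_day = blocks_per_day * TX_PER_BLOCK
bundled_items_per_day = base_tx_per_day * ITEMS_PER_BUNDLE

print(f"blocks per day:              {blocks_per_day}")
print(f"plain transactions per day:  {base_tx_per_day}")
print(f"bundled data items per day:  {bundled_items_per_day}")
```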
The ANS-104 storage demonstration uses the arseeding light node as the data bundling service. The arseeding light node is a fully open-source Arweave data node that supports all native Arweave node interfaces and adds an ANS-104 interface. In addition, arseeding integrates the cross-chain payment protocol everPay, so users and developers can pay for permanent storage with assets such as ETH, BNB, USDT, and USDC, as well as with AR.
For more documentation, refer to: https://web3infra.dev/docs/Arseeding/guide/quickStart.
Storage Costs
Currently, storing 1 GB of data on Arweave costs $7.5. For the latest storage fees, refer to: https://ar-fees.arweave.dev/.
Retrieving and Downloading Data from Arweave
Arweave has standardized GraphQL service interfaces, allowing individuals and organizations to implement Arweave indexing according to standards. Below are two typical and user-friendly indexing gateways:
- ArweaveNet gateway, with the most comprehensive indexing: https://arweave.net/graphql
- KNN3 gateway, providing real-time retrieval of arseeding node data with fast speed: https://knn3-gateway.knn3.xyz/arseeding/graphql
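As an illustration of how such a gateway is queried, the sketch below builds a GraphQL request that looks up transactions by tag and posts it to the ArweaveNet endpoint listed above. The helper names (`build_query`, `run_query`) and the chosen tag are assumptions for the example; the `transactions(tags: ..., first: ...)` query shape follows the Arweave GraphQL interface as commonly documented, and other gateways may expose different fields.

```python
import json
import urllib.request

GATEWAY_GRAPHQL = "https://arweave.net/graphql"

def build_query(tag_name: str, tag_value: str, first: int = 5) -> dict:
    """Build a GraphQL payload that finds transactions carrying a given tag."""
    query = (
        "query { transactions("
        f"tags: [{{name: {json.dumps(tag_name)}, values: [{json.dumps(tag_value)}]}}], "
        f"first: {int(first)}"
        ") { edges { node { id tags { name value } } } } }"
    )
    return {"query": query}

def run_query(payload: dict, endpoint: str = GATEWAY_GRAPHQL) -> dict:
    """POST the payload to a gateway and return the decoded JSON response."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

payload = build_query("Content-Type", "application/pdf")
print(payload["query"])
```

Calling `run_query(payload)` requires network access; the IDs found under `edges[].node.id` in the response can then be used to download the data.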
To download Arweave data, you only need to know the ARID or ItemID of the data.
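A minimal download sketch follows, assuming the gateway serves data at `<gateway>/<id>`, as arweave.net does for transaction IDs. Whether a given gateway can serve a particular ItemID depends on whether it has indexed the bundle, so the gateway URL is left as a parameter; the helper names and the placeholder ID are illustrative.

```python
import urllib.request

DEFAULT_GATEWAY = "https://arweave.net"

def data_url(data_id: str, gateway: str = DEFAULT_GATEWAY) -> str:
    """Build the download URL for an ARID or ItemID on a gateway."""
    return f"{gateway.rstrip('/')}/{data_id}"

def download(data_id: str, dest_path: str, gateway: str = DEFAULT_GATEWAY) -> int:
    """Fetch the data behind an ID, write it to a file, and return the byte count."""
    with urllib.request.urlopen(data_url(data_id, gateway), timeout=60) as resp:
        body = resp.read()
    with open(dest_path, "wb") as fh:
        fh.write(body)
    return len(body)

EXAMPLE_ID = "<your-ARID-or-ItemID>"  # placeholder, replace with a real ID
print(data_url(EXAMPLE_ID))
```

With a real ID, `download(some_id, "file.pdf")` saves the content locally.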
Storage Method for Filecoin
Unfortunately, Filecoin does not provide storage tools aimed at ordinary users and developers; for ordinary developers, Filecoin is not in a usable state. Some scattered technical documents suggest using third-party service providers for Filecoin storage, but on close reading of those providers' documentation, most offer only IPFS storage, with no guarantee that the data is stored on Filecoin. With the author's limited expertise, no viable method of storing data on Filecoin could be found, nor any interface for retrieving data directly from Filecoin.
Storage Method for Storj
Storj's storage method is similar to Web2 services. Developers register on the official website and obtain an API key. Storj storage is compatible with the AWS S3 interface, so it is not described further here. Storj's storage cost is very low: 1 GB for one month costs only $0.004. Converted to a 200-year storage cost, however, it comes to $9.6, slightly higher than Arweave.
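The 200-year comparison is simple arithmetic and can be checked directly. The sketch assumes a flat price of $0.004 per GB per month with no bandwidth charges and no price changes over time, and uses the $7.5 per GB Arweave figure quoted earlier in this article.

```python
def rental_cost(price_per_gb_month: float, years: int, gigabytes: float = 1.0) -> float:
    """Total rent for a fixed monthly per-GB price over a number of years."""
    return price_per_gb_month * 12 * years * gigabytes

STORJ_PRICE_PER_GB_MONTH = 0.004  # USD, from the text
ARWEAVE_ONE_TIME_PER_GB = 7.5     # USD, from the text

storj_200y = rental_cost(STORJ_PRICE_PER_GB_MONTH, years=200)
print(f"Storj, 1 GB for 200 years: ${storj_200y:.2f}")
print(f"Arweave, 1 GB one-time:    ${ARWEAVE_ONE_TIME_PER_GB:.2f}")
```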
From the storage operations, it is evident that Arweave's transaction processing model is consistent with that of blockchains like Bitcoin and Ethereum. Filecoin does not provide usable SDKs and interfaces, which is regrettable as the so-called storage leader is not usable for developers. Storj's storage method is completely consistent with Web2.
It is worth noting that Arweave is native blockchain storage: once data is sent to Arweave, it cannot be deleted or tampered with. Filecoin and Storj operate on a leasing model, which allows project parties to stop the storage lease at any time. Under this model, the data does not have blockchain characteristics, and its properties are the same as those of data stored in centralized cloud services.
To clearly distinguish the differences between data storage on Arweave and Filecoin, we can refer to the data on Arweave as "consensus data." Whether it's data on BTC or Ethereum, it belongs to consensus data, possessing the characteristics of immutability and traceability. The data stored in the Filecoin storage leasing market cannot be called consensus data.
Future Development
Decentralized storage has given rise to two completely different business lines. The line represented by Arweave focuses on consensus data, emphasizing decentralized, censorship-resistant, and traceable data. The line represented by Filecoin focuses on a decentralized market, emphasizing the allocation of storage resources and proof of successful storage.
A comparison with the development of DeFi is useful. The early IDEX used blockchain technology to build an order book market, a very traditional business model in which tokens are exchanged by taking orders. The explosion of DeFi came from Uniswap's AMM trading model combined with liquidity mining, which automated order handling, made liquidity composable, and ultimately drove DeFi's explosive growth. In the current decentralized storage race, Filecoin has likewise used blockchain technology to build an order book market, while Arweave uses a unified, AMM-like model to manage data supply and demand. Arweave's unified model is better suited to pricing and processing data, making it easier to turn ordinary data into consensus data. Such consensus-based data may lead to an explosion of "data composability."
It is also necessary to mention the SCP theory (Storage-based Consensus Paradigm), whose core idea is that as long as there is consensus on stored data, applications built from that data can also reach consensus. SCP emphasizes off-chain computation: data can be stored on various chains such as BTC and Ethereum, and a unique state can be formed by aggregating the data on the blockchain. Since these states produce the same result when computed on any computing unit, why do we still need to compute them on-chain and waste so many computing resources?
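The core claim of SCP, that identical stored data plus deterministic rules yields identical state everywhere, can be shown with a toy replay. The record format, the `compute_state` function, and the token rules below are hypothetical and exist only to illustrate the idea; they are not taken from any SCP implementation.

```python
from collections import defaultdict

def compute_state(records: list[dict]) -> dict:
    """Replay ordered records deterministically to derive account balances."""
    balances = defaultdict(int)
    for rec in records:
        if rec["op"] == "mint":
            balances[rec["to"]] += rec["amount"]
        elif rec["op"] == "transfer" and balances[rec["from"]] >= rec["amount"]:
            balances[rec["from"]] -= rec["amount"]
            balances[rec["to"]] += rec["amount"]
        # Invalid records are skipped, and every replayer skips them the same way.
    return dict(balances)

# Records as they would be read, in order, from the storage layer.
stored_records = [
    {"op": "mint", "to": "alice", "amount": 100},
    {"op": "transfer", "from": "alice", "to": "bob", "amount": 30},
    {"op": "transfer", "from": "bob", "to": "carol", "amount": 50},  # rejected: bob has 30
]

node_a = compute_state(stored_records)        # one off-chain computing unit
node_b = compute_state(list(stored_records))  # another unit reading the same data
print(node_a == node_b, node_a)
```

Both replays arrive at the same balances without any on-chain execution, which is the property the storage layer needs to guarantee: the same records, in the same order, for everyone.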
The currently popular BRC20 and Bitcoin Script both use off-chain computation consensus. The BRC20 protocol and the storage consensus emphasized by Arweave SCP are consistent, using the blockchain as the data layer to provide immutable and traceable transaction data, with state computation entirely conducted off-chain. Leveraging Arweave's storage capabilities, the SCP theory can obtain a more powerful consensus data set. The Arweave SCP theory has developed a comprehensive engineering application solution called Permaweb, which is the ultimate version of a Bitcoin indexer. Permaweb can not only handle assets but also process text, images, and even videos. Imagine in the near future, a super powerful indexer can play streaming media, creating a completely decentralized TikTok.
Currently, the Permaweb solution supports a wide range of application types, making it easy to develop applications such as cloud storage, content collaboration, and games using this architecture. Data between Permaweb applications can be combined. For example, an author can upload their created text and copyright to Arweave through content collaboration, and in another game, developers can directly reference the author's content and allow players to pay the author for copyright.
The biggest challenge DePIN currently faces is blockchain performance. DePIN devices will enter millions of households, but no blockchain can handle such a huge user interaction. Most DePINs still use a centralized approach to process data, which will cause them to lose their decentralized characteristics. Consensus data can bring more powerful capabilities to DePIN. Once DePIN data is made permanent, it will also acquire composability features. For example, a green energy certificate can offset energy consumption in blockchain PoW calculations, become an identifier in content creation, and serve as a badge in games. Data and value will flow everywhere.
Consensus data also applies to the field of artificial intelligence (AI). Human knowledge and history should be preserved forever, and consensus data can ensure that AI cannot pollute or tamper with them. Similarly, consensus data can serve as the best raw material for AI, allowing AI to learn from and process all kinds of valid information.