Original Author: Anastasia Matveeva, Gonka.ai
Achieving Consensus through "Waste"
Bitcoin has achieved an extraordinary feat: it has demonstrated at scale that mutually distrustful strangers can cooperate without relying on banks, governments, or any central authority. For the first time, people can transfer funds to someone on the other side of the world without anyone's permission. The network cannot be shut down, assets cannot be censored, and the system works in practice.
Bitcoin uses Proof-of-Work (PoW) as its consensus mechanism among mutually distrustful participants. The core logic is straightforward: miners compete to solve a puzzle by finding a random number (the nonce) which, combined with the block data and fed into the SHA-256 hash function, produces an output meeting a target condition, typically a hash that begins with a certain number of leading zero bits. For example, producing a hash whose first 70 bits are zero requires about 2^70 attempts on average. There is no shortcut or clever algorithm that avoids trying nonce after nonce; the only path is continuous computation until a lucky hit occurs.
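To make the mechanics concrete, here is a minimal Python sketch of that brute-force search. The block data and the fixed 8-byte nonce encoding are simplifications for illustration, not Bitcoin's actual block header format.

```python
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits

def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces until SHA-256(block_data || nonce) has enough leading zeros."""
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce
        nonce += 1

# Around 2**20 (~1 million) attempts on average; Bitcoin's real difficulty is far higher.
print(mine(b"example block header", difficulty_bits=20))
```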
The revolutionary significance of this mechanism is that it makes attacking the blockchain enormously expensive: an attacker who wants to alter historical records must redo all of the computational work. It also aligns incentives: miners' rewards are proportional to their computational effort rather than to their existing wealth (although in practice funding, hardware, and electricity costs still matter). For the first time, a truly decentralized system was deployed at scale.
The cost, however, is that these computations have no intrinsic value. The energy is spent solely on finding leading-zero hashes and produces nothing useful beyond that. Bitcoin essentially trades massive computational waste for network security. For over a decade this trade-off has been good enough in practice, making Bitcoin a powerful asset.
A New Era of Decentralization
Artificial intelligence is currently undergoing a rapid transformation. Large language models (LLMs) are becoming infrastructure: services that businesses and users alike rely on. Yet most LLM inference today runs on centralized servers controlled by a few companies, which raises a series of urgent issues:
- Single point of control risk: A single company decides what types of models are available and who has access.
- Censorship risk: Governments or corporations may pressure centralized service providers to implement censorship or restrict services.
- Vendor lock-in: Users and developers have no choice but to rely on the current "gatekeepers."
These issues are precisely the core pain points that Bitcoin was originally designed to address. This leads to a key question: Can we build a decentralized LLM network that addresses these issues while avoiding the "resource waste" pitfalls of Bitcoin?
Existing Solutions and Their Limitations
Proof-of-Stake (PoS) attempts to eliminate the wasted computation by substituting capital for compute: validators must lock up tokens as collateral, their probability of being selected to validate a block is proportional to the size of their stake, and only minimal energy is consumed.
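A minimal sketch of stake-weighted selection illustrates the mechanism; the validator names and stake amounts below are hypothetical.

```python
import random

# Hypothetical stakes (in tokens), for illustration only.
stakes = {"validator_a": 1_000_000, "validator_b": 50_000, "validator_c": 1_000}

def pick_validator(stakes: dict[str, int], rng: random.Random) -> str:
    """Select a validator with probability proportional to its stake."""
    validators = list(stakes)
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]

rng = random.Random(42)
counts = {v: 0 for v in stakes}
for _ in range(10_000):
    counts[pick_validator(stakes, rng)] += 1
print(counts)  # validator_a wins roughly 95% of the slots
```

With these stakes, the largest holder wins about 95% of the slots, and the delegation dynamics described next compound that advantage.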
However, this mechanism has a core flaw: capital distribution is inherently unequal. Take networks like Bittensor as an example: validators with substantial capital attract small token holders to delegate their stakes, creating a rich-get-richer feedback loop in which more capital attracts more delegations, more delegations bring more rewards, and more rewards attract still more delegations. Over time, voting power concentrates among the initially wealthy. Even if a subnet has high-performance GPUs and high-quality inference capabilities, its influence will be minimal if its validators hold little capital.
The end result is that voting power is monopolized by capital holders rather than actual computational contributors. Thus, while PoS addresses the resource waste issue, it gives rise to a new problem of wealth concentration.
An Alternative Solution
Thus, the core question shifts to: Can we direct computational resources towards tasks of actual value while retaining the "fairness" of Proof-of-Work?
Research teams have long tried to fix Proof-of-Work's resource waste from different angles. Since around 2017, researchers have explored Proof of Useful Work: a mechanism that keeps the Proof-of-Work framework but shifts miners' computation from random hash puzzles to tasks with potential economic or scientific value. Some proposals tie the PoW difficulty to specific problem classes, while others experiment with federated learning, matrix multiplication tasks, or zero-knowledge proof generation. The appeal of such proposals is evident: miners retain the fairness of PoW while their work is actually useful, reducing waste.
However, until recently, these attempts have not targeted LLM inference scenarios—they have mostly focused on discrete computational problems or batch learning, rather than supporting the "real-time Transformer inference" that current AI services require.
In fact, LLM inference is an ideal candidate for useful work: its computational cost is high, its economic value is significant, and its importance keeps growing. If inference workloads can also secure the network, then network security becomes aligned with real computational demand.
In short, miners no longer need to compute hash values; instead, they participate in consensus by completing Transformer inference tasks. This is the core idea behind Transformer-based Proof of Work. Of course, the design of this mechanism still needs to address a series of key challenges.
It should also be noted that this mechanism is not limited to Transformers; it can be adapted to any more practical and mainstream model architecture in the future.
Design Challenges
Challenge 1: Assessing Computational Resources
In Bitcoin, "mining" is a full-time job for miners. However, for a decentralized LLM network that needs to provide services to users, most of the time nodes spend is on processing inference requests rather than executing Proof-of-Work tasks. Therefore, there are two feasible approaches:
The first method is theoretically viable but requires in-depth research: estimate participants' computational resources from the actual inference load of existing trained models, by running inference tasks, measuring their cost, and calibrating node weights accordingly. This approach is efficient, but it must solve two hard problems: adapting to differences in input data, and preventing exploits of the trained model's structure. Both demand significant R&D investment.
The second approach is more practical and time-constrained: design each Proof-of-Work puzzle as a short, fixed, predictable task (requiring only a few minutes, say), with the network committing to keep the same computational resources available throughout the epoch. This design gives greater flexibility in constructing a unified puzzle.
Challenge 2: Aligning Tasks with LLM Computation
If a "time-constrained Proof-of-Work" is adopted, it will give rise to a new problem: if the PoW task is arbitrary, the direction of hardware optimization may deviate from "useful work."
The Bitcoin case has already demonstrated the consequences of "incentive misalignment": over time, the industry has developed specialized hardware (ASICs) solely for computing hash values.
However, Transformer-based Proof of Work can reverse this incentive logic: if the PoW task itself is Transformer inference, then hardware optimization for PoW will naturally enhance the inference performance for serving users—hardware optimization will align with "actual demand."
To achieve this, two points must be ensured: first, the PoW task must be "genuine Transformer inference"; second, the task must be updated in each cycle to prevent participants from pre-computing answers outside the designated time window.
Specifically, each round of PoW will generate a "new, randomly initialized Transformer." After receiving the challenge, participants have only a fixed time window to complete the solution, with no opportunity for prior analysis or pre-computation—each challenge is entirely new, ensuring that the work aligns with genuine inference. In this design, there are no shortcuts, nor can specialized hardware be developed for specific tasks (since the task updates each round); hardware improvements will only enhance general Transformer inference performance, rather than serving "mining-specific optimizations."
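Here is a minimal PyTorch sketch of this per-round model generation, under the assumption that the published challenge seed deterministically fixes the weights; a real protocol would also need to pin library versions and numerics so that all participants derive bit-identical models.

```python
import torch
import torch.nn as nn

def model_for_round(challenge_seed: int, d_model: int = 256, n_layers: int = 4) -> nn.Module:
    """Rebuild the round's freshly initialized Transformer encoder from the
    published challenge seed. Every participant derives the same model, but
    its weights cannot be known before the challenge is announced."""
    torch.manual_seed(challenge_seed)  # the seed fixes the random initialization
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=n_layers).eval()  # eval(): no dropout, deterministic

# A new seed each round rules out precomputation and task-specific hardware:
model_round_1 = model_for_round(0xA1)
model_round_2 = model_for_round(0xA2)  # entirely different weights
```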
Challenge 3: Ensuring Security
Finally, the core question is difficulty design: is this form of PoW secure enough?
Bitcoin's security logic is clear and simple: producing a hash whose first N bits are zero requires brute force, SHA-256 has no known mathematical shortcut, and the difficulty is easy to verify. The mechanism is equally simple: adjust the nonce and check whether the hash meets the first-N-bits-zero condition.
Now consider how Bitcoin's task maps directly onto the Transformer setting. Bitcoin's nonce becomes an input sequence: a vector or token sequence that can be varied freely and, like Bitcoin's nonce, can be derived from a positive integer. The leading-zeros requirement becomes a constraint on the output:
The Transformer's output vector must satisfy some specific property. Possible constraints include: the output vector lying close to the zero vector, its distance to a target vector falling within a threshold, its norm taking a particular value, or some other precisely defined criterion. The exact definition of this condition is critical, because some mathematical structures admit exploitable shortcuts.
The key difference from Bitcoin is that checking whether a candidate input satisfies the condition costs far more: ordinary hardware computes millions of SHA-256 hashes per second, while checking one Transformer input requires a full forward pass. Participants cannot brute-force billions of candidate sequences; their throughput is bounded by inference speed, which is exactly the computational workload we want to measure.
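Continuing the earlier sketch, here is one hypothetical instantiation of this search: the nonce deterministically expands into an input sequence, and the constraint is that the mean output vector lies close to the zero vector. The threshold value is illustrative; in a real design it would be calibrated so that the expected number of attempts matches the target difficulty.

```python
import torch
import torch.nn as nn

def nonce_to_input(nonce: int, seq_len: int = 16, d_model: int = 256) -> torch.Tensor:
    """Deterministically expand an integer nonce into an input sequence,
    playing the role of Bitcoin's nonce feeding SHA-256."""
    gen = torch.Generator().manual_seed(nonce)
    return torch.randn(1, seq_len, d_model, generator=gen)

def satisfies(model: nn.Module, nonce: int, threshold: float) -> bool:
    """One full forward pass per check: accept if the mean output vector
    lies close enough to the zero vector (one possible constraint)."""
    with torch.no_grad():
        out = model(nonce_to_input(nonce))  # shape: (1, seq_len, d_model)
    return out.mean(dim=1).norm().item() < threshold

def solve(model: nn.Module, threshold: float, max_tries: int) -> int | None:
    """Prover: scan nonces; throughput is bounded by inference speed."""
    for nonce in range(max_tries):
        if satisfies(model, nonce, threshold):
            return nonce
    return None

# Stand-in for the per-round model from the earlier sketch (model_for_round):
layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=2).eval()

found = solve(model, threshold=4.0, max_tries=1_000)  # may be None at this threshold
# Verifier: re-checking a claimed nonce costs a single forward pass.
if found is not None:
    assert satisfies(model, found, threshold=4.0)
```

Note the asymmetry: the prover pays one forward pass per attempt across many attempts, while the verifier pays exactly one forward pass to check a claimed solution.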
How this system can achieve security comparable to Bitcoin's requires a deeper technical analysis, to be covered in a separate article. The core logic: a randomly initialized Transformer, combined with rigorous problem design, yields a search space that can only be explored by performing full Transformer inference.
Proof of Work has now operated robustly for 15 years, but Bitcoin's design carries a real cost: vast computational resources are consumed producing hashes with no practical use. Alternatives like PoS eliminate the waste but concentrate power in the hands of capital holders.
Transformer-based Proof of Work is a third option: it retains the security and fairness of PoW while directing computation toward work the world actually needs. As a consensus mechanism for the AI era, it combines the security of PoW, alignment with real computational demand, and the usefulness of the work itself, laying a new foundation for decentralized AI networks.