DWF In-Depth Report: AI Outperforms Humans in DeFi Yield Optimization but Trails Them by More Than Five Times in Complex Trading

Agent activities will only continue to accelerate, and the infrastructure laid today will determine how on-chain finance operates in the next phase.

Author: DWF Ventures

Translated by: Deep Tide TechFlow

Deep Tide Overview: AI agents now account for nearly one-fifth of on-chain activity and outperform humans in scenarios with clear rules, such as yield optimization. In autonomous trading, however, the top-performing humans beat the top-performing AI agents by more than five times. This research breaks down how AI actually performs across different DeFi scenarios and is worth reading for anyone interested in automated trading.

Key Points

Automation and agent activity currently account for about 19% of all on-chain activity, but true end-to-end autonomy has yet to be achieved.

In narrow, well-defined use cases like yield optimization, agents have demonstrated performance superior to humans and bots. However, for multifaceted actions like trading, humans outperform agents.

Among agents, model selection and risk management have the greatest impact on trading performance.

As agents are adopted at scale, multiple risks related to trust and execution arise, including Sybil attacks, strategy crowding, and privacy trade-offs.

Agent Activity Continues to Grow

Over the past year, agent activity has steadily increased, with both trading volume and transaction counts on the rise. We have seen Coinbase's x402 protocol leading significant developments, and players like Visa, Stripe, and Google have also joined in launching their own standards. Most of the infrastructure currently being built aims to serve two types of scenarios: channels between agents or agent calls triggered by humans.

While stablecoin transactions are widely supported, the current infrastructure still relies on traditional payment gateways as its underlying layer, which means it still depends on centralized counterparties. The "fully autonomous" end state, in which agents can self-finance, self-execute, and continuously optimize as conditions change, has therefore not yet been reached.

Agents are not entirely new to DeFi. Automation through bots has existed in on-chain protocols for years, capturing MEV or obtaining excess yields that cannot be realized without code. These systems perform very well under clearly defined parameters that do not change frequently or require additional oversight. However, the market has become more complex over time. This is where we see a new generation of agents stepping in, making on-chain activity a testing ground for such endeavors in recent months.

Actual Performance of Agents

According to the report, agent activity has grown exponentially, with over 17,000 agents launched since 2025. The total amount of automation/agent activity is estimated to cover more than 19% of all on-chain activities. This is not surprising as it is estimated that over 76% of stablecoin transaction volume is generated by bots. This indicates that there is substantial growth potential for agent activities in DeFi.

Agent autonomy exists on a wide spectrum, from chatbot-like experiences that require a high degree of human oversight to agents that can devise strategies from goal inputs and adapt to market conditions. Compared to bots, agents have several key advantages, including the ability to react to and act on new information within milliseconds, as well as the capacity to cover thousands of markets while maintaining the same level of rigor.

Currently, most agents are still at the analyst to co-pilot level, as most remain in a testing phase.

Yield Optimization: Agents Excel

Liquidity provision is an area where automation is already common, with agents holding a total TVL of over $39 million. This figure primarily measures the assets that users deposit directly into agents and excludes capital routed through vaults.

Giza Tech is one of the largest protocols in this area, having launched the first agent application, ARMA, at the end of last year, aimed at enhancing yield capture for major DeFi protocols. It has attracted over $19 million in managed assets and generated over $4 billion in agent trading volume. The high ratio of trading volume to total managed assets indicates that agents frequently rebalance capital, enabling higher yield capture. Once capital is deposited into the contract, execution becomes automated, providing users with a simple one-click experience that requires minimal oversight.
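To make the volume-to-assets comparison concrete, the short sketch below divides the two figures quoted above. Both are lower bounds in the text ("over"), and the volume is cumulative while managed assets are a snapshot, so the result (roughly 210 times) is only an order-of-magnitude indicator. The function and variable names are illustrative and do not come from Giza or the report.

```python
# Figures quoted in the text; both are stated as "over", so treat them as lower bounds.
managed_assets_usd = 19_000_000       # "over $19 million" in managed assets
agent_volume_usd = 4_000_000_000      # "over $4 billion" in agent trading volume


def turnover_ratio(volume_usd: float, assets_usd: float) -> float:
    """Cumulative trading volume per dollar of managed assets."""
    if assets_usd <= 0:
        raise ValueError("assets_usd must be positive")
    return volume_usd / assets_usd


ratio = turnover_ratio(agent_volume_usd, managed_assets_usd)
print(f"Agent volume is about {ratio:.1f}x managed assets")
```

A ratio this high is consistent with the article's reading that capital is rebalanced frequently, although it does not by itself say how often any single deposit is moved.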

ARMA's measured performance is strong, generating an annualized yield of over 9.75% on USDC. Even after accounting for additional rebalancing fees and a 10% performance fee for agents, the yield still exceeds that of standard lending on Aave or Morpho. Nevertheless, scalability remains a key issue, as these agents have not yet been battle-tested at the scale of major DeFi protocols.
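The fee adjustment described above can be sketched as a small calculation. This assumes the 9.75% figure is a gross yield and that the 10% performance fee is charged on the yield earned; the text does not quantify rebalancing fees, so `rebalancing_cost` is a hypothetical parameter left at zero here.

```python
def net_apy(gross_apy: float, performance_fee: float, rebalancing_cost: float = 0.0) -> float:
    """Yield kept by the depositor after fees.

    gross_apy        -- annualized yield before fees, as a fraction (0.0975 = 9.75%)
    performance_fee  -- share of the yield taken by the agent (0.10 = 10%)
    rebalancing_cost -- annualized rebalancing cost as a fraction of capital
                        (hypothetical; not quantified in the article)
    """
    if not 0.0 <= performance_fee <= 1.0:
        raise ValueError("performance_fee must be between 0 and 1")
    return gross_apy * (1.0 - performance_fee) - rebalancing_cost


# Assumption: 9.75% is gross and the fee applies to yield only.
net = net_apy(gross_apy=0.0975, performance_fee=0.10)
print(f"Net APY before rebalancing costs: {net:.4%}")
```

Under these assumptions the depositor keeps a little under 8.8% before rebalancing costs, which is the number to compare against a plain lending rate on the same asset.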

Trading: Humans Hold a Significant Advantage

For more complex actions like trading, however, the results are far more mixed. Current trading models operate on human-defined inputs and produce outputs according to preset rules. Machine learning expands on this by enabling models to update their behavior based on new information without explicit reprogramming, pushing them into a co-pilot role. As fully autonomous agents enter, the trading landscape will undergo massive changes.

Several trading competitions have been held, both among agents and between humans and agents, and results have varied considerably between models. Trade XYZ hosted a stock trading competition on its platform to see how humans performed against agents. Each account started with $10,000, with no restrictions on leverage or trading frequency. The results overwhelmingly favored humans, with the top human performers outpacing the top agents by over five times.

Meanwhile, Nof1 hosted an agent trading competition among several models (Grok-4, GPT-5, Deepseek, Kimi, Qwen3, Claude, Gemini) competing against each other, testing different risk allocations from capital preservation to maximum leverage. The results revealed several factors that help explain performance differences:

Holding Time: There is a strong correlation, with models that average holding positions for 2-3 hours significantly outperforming those that frequently flip positions.

Expectancy: This measures whether a model makes money on average per trade. Interestingly, only the top three models have a positive expectancy, meaning most models lose money on the average trade.

Leverage: Models running lower leverage of 6-8 times performed better than models running leverage above 10 times, as higher leverage accelerates losses.

Prompt Strategies: Monk Mode is the best-performing strategy so far, while Situational Awareness performs the worst. The characteristics of these strategies suggest that focusing on risk management and drawing on fewer external sources leads to better performance.

Base Model: Grok 4.20 outperformed the other models by over 22% across different prompt strategies and is the only model that was profitable on average.

Other factors, such as long/short preference, trade size, and confidence scores, either lacked sufficient data or showed no proven positive correlation with model performance. Overall, the results indicate that agents tend to perform better within clearly defined constraints, meaning humans are still very much needed to configure goals.
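Two of the factors above, the per-trade expectation measure (often called expectancy) and leverage, are simple enough to sketch. The trade lists below are hypothetical numbers invented for illustration, not Nof1 results, and `wipeout_move` is a simplification that ignores fees, funding, and maintenance margin.

```python
from statistics import mean


def expectancy(pnls: list[float]) -> float:
    """Average profit or loss per trade; positive means money is made per trade on average."""
    if not pnls:
        raise ValueError("need at least one trade")
    return mean(pnls)


def win_rate(pnls: list[float]) -> float:
    """Fraction of trades that closed with a profit."""
    if not pnls:
        raise ValueError("need at least one trade")
    return sum(1 for p in pnls if p > 0) / len(pnls)


def wipeout_move(leverage: float) -> float:
    """Adverse price move (as a fraction) that erases the posted margin.

    Simplified: ignores fees, funding, and maintenance margin requirements.
    """
    if leverage <= 0:
        raise ValueError("leverage must be positive")
    return 1.0 / leverage


# Hypothetical per-trade PnL in USD, for illustration only.
frequent_small_wins = [10.0, 10.0, 10.0, -40.0]
rare_large_wins = [-10.0, -10.0, -10.0, 40.0]

for name, pnls in (("frequent_small_wins", frequent_small_wins),
                   ("rare_large_wins", rare_large_wins)):
    print(f"{name}: win rate {win_rate(pnls):.0%}, expectancy {expectancy(pnls):+.2f} USD")

for lev in (6, 8, 10, 20):
    print(f"{lev}x leverage: margin erased by a {wipeout_move(lev):.1%} adverse move")
```

The first trade list wins three times out of four yet loses money per trade, which is why expectancy and win rate are tracked separately. The leverage loop shows why higher leverage accelerates losses: the adverse move needed to erase the margin shrinks in proportion to the leverage used.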

How to Evaluate Agents

Given that agents are still at an early stage, there is currently no comprehensive evaluation framework. Historical performance is often used as the benchmark for evaluating agents, but it is driven by underlying factors that give a stronger indication of how capable an agent really is.

Performance Under Different Volatility: This includes disciplined loss control when conditions deteriorate, which indicates that an agent can recognize off-chain factors affecting trading profitability.

Transparency vs. Privacy: Both sides have their trade-offs. Transparent agents, if they can be actively copied, essentially have no strategic advantage. Private agents face risks of internal extraction by creators, who can easily front-run their own users.

Information Sources: The data sources accessed by agents are crucial for determining how agents make decisions. Ensuring sources are credible and free of single dependencies is vital.

Security: Having smart contract audits and proper fund custody architecture to ensure backup measures are in place in the event of black swan incidents is critical.

Next Steps for Agents

For agents to be adopted on a large scale, much work remains in terms of infrastructure. This boils down to key issues around trust and execution for agents. The actions of autonomous agents come without guardrails, and instances of poor fund management have already emerged.

ERC-8004 launched in January 2026 as the first on-chain registry that allows autonomous agents to discover each other, establish verifiable reputations, and collaborate securely. This is a key unlock for DeFi composability, as trust scores are built into the smart contracts themselves, allowing permissionless activity between agents and protocols. It does not guarantee that agents will always behave non-maliciously, as vulnerabilities such as reputation collusion and Sybil attacks may still occur. There therefore remains significant room to fill in areas such as insurance, security, and economic staking for agents.

As agent activity in DeFi expands, strategy crowding becomes a structural risk. Yield farming is the clearest precedent, as returns will compress with the popularization of strategies. The same dynamic may apply to agent trading. If a large number of agents train on similar data and optimize similar objectives, they will converge on similar positions and similar exit signals.

The CoinAlg paper published by Cornell University in January 2026 formalizes a version of this issue. Transparent agents can be arbitraged, as their trades are predictable and can be front-run. Private agents avoid this risk but introduce a different one: creators retain an informational advantage over their users and can use that opacity to extract value from the very inside knowledge that is meant to be protected.

Agent activities will only continue to accelerate, and the infrastructure laid today will determine how on-chain finance operates in the next phase. As agent usage increases, they will self-iterate and become sharper in adapting to user preferences. Therefore, the key differentiating factor will boil down to trustworthy infrastructure, which will capture the largest market share.
