Harnesses could be the new moat in AI

On SWE-bench Pro, running the same model through different scaffolds swings the resolve rate by 22 percentage points. Swapping among the six best frontier models inside the same scaffold moves it by less than one point.

Meta and Harvard paired Sonnet 4.5 with a custom harness and scored 52.7%, beating Anthropic's own scaffold running the more expensive Opus at 52.0%. The cheaper model won because its harness was better.

@NousResearch built Hermes Agent around that same thesis. Every session writes its state to your machine. Conversation history goes into a local SQLite database with full-text search, and project context and personal preferences sit in markdown files the agent loads at startup. Complex tasks get saved as reusable skill files, and Honcho keeps a structured profile of how you work.
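
To make the local-state idea concrete, here is a minimal sketch of a conversation log stored in SQLite with full-text search. It is not Hermes Agent's actual schema or code: the table name `messages`, its columns, and the helpers `open_store`, `log_message`, and `search` are all hypothetical, and the sketch assumes the Python build's SQLite was compiled with the FTS5 extension (most are).

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a local store. Hypothetical schema, for illustration only."""
    conn = sqlite3.connect(path)
    # FTS5 virtual table: only `content` is indexed for full-text search.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    return conn


def log_message(conn, session_id, role, content):
    """Append one conversation turn to the local database."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def search(conn, query, limit=5):
    """Return past turns matching an FTS5 query, best matches first."""
    rows = conn.execute(
        "SELECT session_id, role, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    log_message(store, "s1", "user", "Fix the pagination bug in the API client")
    log_message(store, "s1", "assistant", "Patched the offset calculation")
    log_message(store, "s2", "user", "Draft release notes for the new version")
    for row in search(store, "pagination"):
        print(row)
```

Because the file lives on the user's machine, any agent or model that can read SQLite can query the same history.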

The model underneath is interchangeable. Closed agents keep your context with them, and leaving means rebuilding from scratch. Hermes lets you take your context to whichever model you want.
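
The portability claim can be sketched as a harness that owns the context and treats the model as a plain function it calls. Everything here is a hypothetical illustration, not Hermes Agent's API: the `Harness` class, its fields, and the stub functions `model_a` and `model_b` stand in for whatever real harness and model backends are in use.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A "model" here is any function from prompt text to reply text (assumption).
ModelFn = Callable[[str], str]


@dataclass
class Harness:
    """Holds context locally; the model is passed in per call."""
    context_notes: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def build_prompt(self, user_msg: str) -> str:
        # Context and history come from the harness, not from the model vendor.
        parts = self.context_notes + self.history + [f"user: {user_msg}"]
        return "\n".join(parts)

    def run(self, model: ModelFn, user_msg: str) -> str:
        reply = model(self.build_prompt(user_msg))
        self.history.append(f"user: {user_msg}")
        self.history.append(f"assistant: {reply}")
        return reply


def model_a(prompt: str) -> str:
    """Stub backend standing in for one provider."""
    return f"[model-a] saw {len(prompt.splitlines())} lines of context"


def model_b(prompt: str) -> str:
    """Stub backend standing in for a different provider."""
    return f"[model-b] saw {len(prompt.splitlines())} lines of context"


if __name__ == "__main__":
    harness = Harness(context_notes=["note: user prefers pytest"])
    print(harness.run(model_a, "set up tests"))
    # Switching models mid-session keeps the accumulated history.
    print(harness.run(model_b, "now add CI"))
```

The second call goes to a different backend but still receives the notes and the first exchange, which is the sense in which the context travels with the user instead of with the model.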

Closed labs hold an edge in long sessions because they fine-tune against thousands of hours of degraded runs, and open-source providers haven't had the session data to match.

Hermes routed 3.2 trillion tokens in one week on OpenRouter, which gives Nous a path to closing that gap.
