Golden Finance | Jun 12, 2026 10:48
[Epoch AI Evaluates Mythos: Exploit Development Leads Trend by 7 Months, Vulnerability Discovery Advantage Overstated]
According to a report by Golden Finance, Epoch AI has released a research report that systematically evaluates the cybersecurity capabilities of Anthropic's Claude Mythos series models. The researchers built a "Cyber Capability Index" (Cyber-ECI) from multiple security benchmarks. The results show that Mythos Preview stands out in exploit development: it runs approximately 7 months ahead of the linear trend in frontier capability and, measured against that trend, about 3 months ahead of OpenAI's GPT-5.5.
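To make the "months ahead of trend" framing concrete, here is a minimal sketch of how such a lead could be computed from a linear trend. The function name, the trend parameters, and every number below are hypothetical illustrations, not figures or methods taken from the Epoch AI report.

```python
def months_ahead_of_trend(score, release_month, intercept, slope_per_month):
    """Return how many months early a model reached `score`, relative to a
    linear trend score = intercept + slope_per_month * month.

    All inputs are hypothetical; the report's actual fitting method is not
    described in this article.
    """
    if slope_per_month <= 0:
        raise ValueError("trend must be increasing to define a lead time")
    # Month at which the trend line itself would reach this score.
    month_trend_reaches_score = (score - intercept) / slope_per_month
    # Positive result: the model arrived before the trend predicted.
    return month_trend_reaches_score - release_month


# Hypothetical example: the trend gains 2 index points per month from a base
# of 100, and a model released at month 20 scores 154.
lead = months_ahead_of_trend(score=154.0, release_month=20.0,
                             intercept=100.0, slope_per_month=2.0)
print(f"{lead:.1f} months ahead of trend")
```

Under these made-up inputs the trend would reach a score of 154 at month 27, so a release at month 20 counts as a 7-month lead.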
Some researchers had previously suggested that Mythos Preview merely performed on par with GPT-5.5. The report attributes this discrepancy to early evaluations that used weaker internal versions rather than the April version later made available to Glasswing partners. In addition, some test sets were approaching saturation, which makes it hard for any model to open a performance gap. Real-world security tests indicate that Mythos Preview can often construct arbitrary code execution exploits autonomously, whereas earlier models were almost never able to do so.
For vulnerability discovery, strong evidence of a similar lead is lacking. After the release of Mythos Preview, reports of high-risk and critical vulnerabilities (CVEs) from 21 well-known institutions rose by 142% in April and 262% in May relative to the 2025 baseline, but the spike may be attributable mainly to funding. Anthropic provided Glasswing with API credits worth up to $100 million, which drew a large number of security researchers into concurrent searches over a short period.
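As a reading aid for the percentages above, the sketch below shows the arithmetic linking a percentage increase to a multiple of the baseline. The function names and the baseline count of 100 reports are hypothetical; the article does not give the underlying report counts.

```python
def percent_increase(observed, baseline):
    """Percentage increase of `observed` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) * 100.0 / baseline


def multiple_of_baseline(pct_increase):
    """Convert a percentage increase into a multiple of the baseline."""
    return 1.0 + pct_increase / 100.0


# Hypothetical baseline of 100 reports per month (not from the article).
baseline = 100
print(percent_increase(242, baseline))   # a 142% increase
print(multiple_of_baseline(142))         # roughly 2.42 times the baseline
print(multiple_of_baseline(262))         # roughly 3.62 times the baseline
```

In other words, a 142% increase means roughly 2.4 times the baseline volume, and a 262% increase means roughly 3.6 times.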
For example, in tests on the heavily scrutinized curl codebase, Mythos Preview identified only one low-risk vulnerability along with four false positives, performing no better than traditional auditing tools. The startup AISLE also confirmed that some lightweight open-source models could identify the same vulnerabilities. Epoch AI believes that Mythos's real advantage lies in its extremely low false positive rate and its more accurate severity classification. Rather than the direct discovery of new vulnerabilities, it is this more precise classification that can help defenders significantly reduce the time and cost of manual review.