MiniMax @MiniMax_AI has released M2.7, claiming coding ability close to Opus level, with its MM-Claw evaluation close to Sonnet 4.6.
Has anyone started using it yet?
How does it perform in practice?
And if capability has really improved this much, could the price come down a bit?
According to their public documentation:
The team built a research-oriented agent framework on an early version of M2, letting the model autonomously handle reinforcement-learning skill construction, memory updates, and process optimization.
In one internal experiment, M2.7 autonomously ran over 100 iterations (analyzing failure trajectories, modifying its scaffold, running evaluations, and comparing results), ultimately achieving a 30% improvement on internal evaluations.
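The iteration loop described there (analyze failures, modify the scaffold, run an evaluation, keep the better variant) is, at its core, a hill-climbing search over scaffold configurations. Below is a minimal sketch of that pattern. Everything here is hypothetical: the parameter names (`retries`, `memory_items`), the toy scoring function, and the mutation rules are invented stand-ins, not MiniMax's actual framework or benchmark.

```python
import random

def run_eval(scaffold):
    """Hypothetical stand-in for an internal benchmark; returns a score in [0, 1].
    A real system would run the agent on tasks and measure pass rate."""
    # Toy model: more retries help, memory size has a sweet spot around 20 items.
    return min(1.0, 0.4 + 0.05 * scaffold["retries"]
                    - 0.01 * abs(scaffold["memory_items"] - 20))

def propose_change(scaffold, rng):
    """Mutate one scaffold parameter ("modifying the scaffold")."""
    new = dict(scaffold)
    if rng.random() < 0.5:
        new["retries"] = max(0, new["retries"] + rng.choice([-1, 1]))
    else:
        new["memory_items"] = max(0, new["memory_items"] + rng.choice([-5, 5]))
    return new

def self_improve(iterations=100, seed=0):
    """Greedy hill climb: evaluate a mutated scaffold, keep it only if it scores higher."""
    rng = random.Random(seed)
    scaffold = {"retries": 1, "memory_items": 5}
    best_score = run_eval(scaffold)
    for _ in range(iterations):
        candidate = propose_change(scaffold, rng)   # modify the scaffold
        score = run_eval(candidate)                 # run the evaluation
        if score > best_score:                      # compare results, keep the winner
            scaffold, best_score = candidate, score
    return scaffold, best_score
```

Because only strictly better candidates are accepted, the score never regresses across iterations; the real system presumably adds richer analysis of failure trajectories rather than blind random mutation.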
On the 22 machine-learning tasks of MLE Bench Lite, it averaged 66.6% across three runs, tying Gemini-3.1 and trailing only Opus 4.6 (75.7%) and GPT-5.4 (71.2%).
On coding benchmarks, M2.7 scored 56.22% on SWE-Pro, matching GPT-5.3-Codex; 55.6% on VIBE-Pro, close to Opus 4.6; and 57.0% on Terminal Bench 2.