Microsoft Says Latest AI Models Beat Claude, Google's Nano Banana

CN
Decrypt
Follow
34 minutes ago

On the first day of the annual Microsoft Build event on Tuesday, the Windows developer unveiled seven new AI models, claiming they outperformed Anthropic's Claude Sonnet 4.6 and Google's Nano Banana 2 in blind testing and image-editing benchmarks.


The claim comes as Microsoft attempts to establish itself as a frontier AI developer rather than solely OpenAI's largest backer and infrastructure provider.


"Super excited to announce seven new world-class MAI models today," Microsoft AI CEO Mustafa Suleyman wrote on X. "They represent what we consider a new era in AI designed to keep you in control and on the frontier."


At the center of the release is MAI-Thinking-1, a reasoning model that Microsoft describes as its flagship text foundation model.



According to Suleyman, MAI-Thinking-1 was preferred over Anthropic's Claude Sonnet 4.6 in blind tests conducted by independent evaluators. He added that the model scored 97% on AIME 2025, a benchmark that measures advanced problem-solving and reasoning skills.


Suleyman said the SWE Bench Pro result places the model "right alongside Opus 4.6 on one of the toughest coding benchmarks."


The company also introduced MAI-Code-1-Flash, a lightweight coding model built for GitHub Copilot and Visual Studio Code; MAI-Image-2.5 and its Flash variant, which Microsoft says outperform Google's Nano Banana Pro on image-editing tasks; MAI Transcribe-1.5, a transcription model that supports 43 languages; and MAI-Voice-2, a speech-generation model capable of producing natural-sounding voices in 15 languages and adapting to a speaker from a short audio sample.





“This is an extraordinary time in technology. The compute used to train frontier models has increased by a factor of one trillion,” Suleyman said in a separate blog post announcing the new models. “Now we expect another thousand-fold increase over the next three years, which in turn means more advanced capabilities, and the continued rollout of ever more effective AI.”


The announcement comes as competition among leading AI developers continues to intensify.


Last week, Anthropic announced the launch of its latest flagship model, Opus 4.8, which the company said is faster and smarter on benchmark tests and comes with a suite of new features. On Tuesday, Anthropic announced an expansion of its Project Glasswing, giving 150 companies access to its new cybersecurity-focused Mythos model.


Meanwhile, at Google I/O in May, Google unveiled Gemini Omni, a multimodal AI model that combines Gemini with the company's Veo, Nano Banana, and Genie media-generation models, alongside Gemini Spark, a cloud-based AI agent designed to manage tasks across apps and workflows on a user's behalf.


Microsoft’s new model launch suggests a broader effort to build proprietary AI systems as it expands beyond its longstanding reliance on OpenAI technology, saying that MAI “delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost.”


“Developers and businesses have been crying out for AI that delivers on their terms and under their say,” Suleyman wrote. “We see this as a major step towards delivering that.”


免责声明:本文章仅代表作者个人观点,不代表本平台的立场和观点。本文章仅供信息分享,不构成对任何人的任何投资建议。用户与作者之间的任何争议,与本平台无关。如网页中刊载的文章或图片涉及侵权,请提供相关的权利证明和身份证明发送邮件到support@aicoin.com,本平台相关工作人员将会进行核查。

Share To
APP

X

Telegram

Facebook

Reddit

CopyLink