Microsoft Build 2026 Developer Conference: The "Agent First" era arrives as Microsoft launches seven in-house models at once

Microsoft releases seven models at Build, and its first flagship reasoning model challenges Anthropic.

Written by: Li Hailun

Edited by: Xu Qingyang

On June 2, U.S. local time, the Microsoft Build 2026 Developer Conference opened at Fort Mason in San Francisco. The conference focused on practical applications of cutting-edge AI, and Microsoft announced a series of products and updates covering in-house AI models, agent applications, operating system security, developer tools, cloud services, and new hardware platforms.

At the 2025 conference, Microsoft set the direction of the "AI Agent Era": it launched multi-agent orchestration in Copilot Studio and Windows AI Foundry, announced full support for the Model Context Protocol, and introduced a coding agent for GitHub Copilot.

In Microsoft's narrative, 2025 answered the question of "what standards and frameworks to use in the agent era," while 2026 focuses on "how to truly put its own models and products to use": the model layer has gained an in-house backbone that can carry heavy loads, and the product layer has moved agents from demonstration to full-stack deployment across systems, hardware, and the cloud.

The core releases of this conference fall into six areas: the in-house MAI model family; the agent ecosystem represented by Scout and the GitHub Copilot application; MXC, the Windows system-level AI security sandbox; the developer-oriented Surface RTX Spark Dev Box with its system-level optimizations; the new agent device platform Project Solara; and developer tools and governance frameworks including Microsoft IQ, Rayfin, ASSERT, and ACS.

01 Seven models trained from scratch, rejecting distillation

The keynote was structured around the vision laid out by Microsoft CEO Satya Nadella. After he proposed the "Agent-First" strategic framework, executives from various business lines took the stage in turn to introduce the specific products that bring it to life.

During the conference, Mustafa Suleyman, head of Microsoft AI, announced seven new models developed in-house by Microsoft AI, collectively referred to as the MAI family.

He described MAI's mission as building a "hill-climbing machine" that improves itself in a continuous cycle through sustained compute investment, better data, and more accurate evaluation, keeping users at the forefront of technology.

On training compute, Suleyman pointed out that the computing power used to train frontier models has grown a trillion-fold and is expected to grow another thousandfold over the next three years. All of Microsoft's MAI models are "climbed from scratch without distillation," meaning they were not trained on the outputs of third-party models.
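To put the projected growth in perspective: a thousandfold increase over three years works out to roughly a tenfold increase per year, provided the growth compounds evenly. The snippet below is a back-of-the-envelope sketch of that arithmetic. The even-compounding assumption is ours and was not stated in the keynote.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Per-year multiplier that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years)


# Assumption: growth compounds evenly across the three years.
factor = annual_growth_factor(1000.0, 3)
print(f"Implied growth: about {factor:.1f}x per year")
```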

Mustafa Suleyman, head of Microsoft AI, introduces the seven in-house models

The specific models are as follows:

The flagship reasoning model, MAI-Thinking-1, is a medium-sized model. Microsoft states that on key software engineering tests its performance matches the best models on the market, and that in blind comparative tests human judges preferred it at a rate comparable to Sonnet 4.6. The model was trained from scratch on clean data, without distillation from third-party models.

The programming model, MAI-Code-1-Flash, is a highly efficient agentic coding model with 5 billion parameters, tailored for GitHub Copilot and VS Code and deeply integrated into the Microsoft tech stack. Microsoft claims it can rival Haiku at a lower cost.

The text-to-image model, MAI-Image-2.5, and its ultra-efficient Flash variant support text-to-image generation and image editing. Microsoft states that it surpasses Google Nano Banana Pro in Arena scores.

The transcription model, MAI-Transcribe-1.5, achieves state-of-the-art accuracy according to Microsoft. It reportedly runs five times faster than competing models and has built-in support for recognizing domain-specific terminology in 43 languages.

The voice generation model, MAI-Voice-2, produces high-quality, natural-sounding speech, supports 15 languages, can adapt a voice from a short sample, and includes safeguards against abuse. A Flash variant, to be released soon, will offer the same functionality at a lower cost.

All models share the same data specification, infrastructure, and evaluation framework. In addition to being distributed on Azure Foundry and optimized for Microsoft first-party products, these models will also be available to developers on OpenRouter, Fireworks, and Baseten. For the first time, developers will be able to tune the model weights themselves.

At the conference, Nadella introduced Microsoft Frontier Tuning, a method for enterprises to customize models using their own work data. The logic is that the most valuable data is not general corpora, but the actual trajectories, steps, and decisions of agents executing tasks within the enterprise.

Microsoft CEO Nadella introduces Frontier Tuning

This mechanism embeds MAI models in actual business processes, allowing the models to learn in real-world environments. Suleyman stated, "You are building your own model: trained with your data in your environment, controlled by you. Your institutional knowledge will become part of the model and belong only to you."

In terms of results, the MAI model tuned for Excel is said to be on par with GPT-5.4 while being ten times more efficient. At McKinsey, the model produced with Frontier Tuning achieved the highest win rate among all tested models, at roughly one-tenth of the cost.

In the healthcare field, Microsoft announced a collaboration with the Mayo Clinic to jointly create a cutting-edge AI model for healthcare. This model will combine the clinical expertise of the Mayo Clinic, de-identified clinical data, and longitudinal insights with Microsoft's foundational AI capabilities.

Microsoft also revealed that the MAI models are being co-designed with its in-house Maia 200 chips, achieving a 1.4-fold efficiency improvement through joint software and hardware optimization.

02 The agent ecosystem is fully implemented

At the conference, Microsoft announced a sweeping shift toward "Agent First," aimed at automating how knowledge workers use software and embedding AI assistants into everyday office interactions.

Scout is the core agent product released this time. This "always online" AI Agent is built on the OpenClaw framework and can interact in Microsoft Teams like a human colleague.

Scout can browse users' work messages, calendars, and email inboxes, automatically complete tasks, reschedule conflicting meetings, and draft professional-sounding replies. Users can send direct commands to it in Teams and can also name it.

Microsoft's newly appointed Corporate Vice President Omar Shahine explained Scout's design philosophy: "Your company is essentially hiring your assistant. The whole point of having a personal assistant is that they are working even when you are not."

Scout is provided through Microsoft's Frontier program and requires a GitHub Copilot subscription. Microsoft is testing a desktop application for Scout, which will be launched for subscribers who opt in to "frontier" access. Shahine noted that within Microsoft, the sales department is the largest and fastest-growing group using the tool.

The GitHub Copilot desktop application is another significant release. GitHub Chief Product Officer Mario Rodriguez introduced it as a "desktop experience that is agent-native, built on GitHub."

Through a unified "My Work" view, developers can see work in progress across connected repositories, including active sessions, issues, pull requests, and background automation. Each session runs in its own Git worktree, so parallel agents do not interfere with each other. The application's Agent Merge capability carries pull requests through review, checks, and merge. The Canvas interface supports two-way interaction between humans and agents, allowing developers to inspect, guide, and verify the work an agent carries out on their behalf.
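The per-session worktree isolation described above can be illustrated with Git's standard `git worktree add` command, which checks out a new branch into a separate directory that shares the same repository history. The sketch below only builds the commands and does not run them. The branch naming scheme (`agent/<session>`), the sibling `-worktrees` directory, and the repository path are hypothetical choices for illustration, not how the GitHub Copilot application is known to lay things out.

```python
import shlex
from pathlib import Path


def worktree_add_command(repo: Path, session_id: str) -> list[str]:
    """Build a `git worktree add` command giving one agent session its own checkout.

    The branch and directory naming scheme here is hypothetical.
    """
    branch = f"agent/{session_id}"
    path = repo.parent / f"{repo.name}-worktrees" / session_id
    # `git -C <repo> worktree add -b <branch> <path>` creates a new branch
    # and checks it out in a separate working directory.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


# Two parallel sessions get two independent working directories.
for session in ("fix-login-bug", "update-docs"):
    print(shlex.join(worktree_add_command(Path("/src/myrepo"), session)))
```

Because each session edits files in its own directory and commits to its own branch, one agent's uncommitted changes never appear in another agent's checkout.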

The GitHub Copilot application is available as a technical preview on Windows 11, Windows 11 on Arm, Mac, and Linux. It requires a GitHub Copilot subscription and will be opened to Copilot Free users in the future. The application supports both cloud and local sandboxes for code review, both of which come with policy support.

For agent security governance, Microsoft released the Agent Control Specification (ACS), a new open-source standard intended to give developers a more consistent and granular way to control the behavior of AI Agents. ACS lets development, compliance, and security teams define policy files for Agents that stipulate what Agents may and must never do, when human approval is needed, and what evidence should be recorded for review.

ACS is released as an SDK, accompanied by plugins for LangChain, OpenAI Agents SDK, Anthropic Agents SDK, AutoGen, CrewAI, Semantic Kernel, Microsoft.Extensions.AI, MCP tools, and others. Since policies can be written as a single file, they can be bundled with the Agent and travel with it across different frameworks and environments.
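As a rough illustration of the idea of a single policy file that states what an Agent may do, what it must never do, and when a human must approve, here is a minimal sketch. The JSON field names, the action names, and the `decide` function are all hypothetical; they are not ACS syntax or part of the ACS SDK, whose actual format is not described in this article.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical single-file policy; the field names are illustrative, not ACS syntax.
POLICY_JSON = """
{
  "deny": ["payments.*", "files.delete"],
  "require_approval": ["email.send"],
  "allow": ["calendar.*", "email.read", "email.send"]
}
"""


def decide(policy: dict, action: str) -> str:
    """Return 'deny', 'needs_approval', or 'allow' for an action name.

    Deny rules win over approval rules, which win over allow rules;
    anything unmatched is denied by default.
    """
    for verdict, key in (("deny", "deny"),
                         ("needs_approval", "require_approval"),
                         ("allow", "allow")):
        if any(fnmatchcase(action, pattern) for pattern in policy.get(key, [])):
            return verdict
    return "deny"


policy = json.loads(POLICY_JSON)
audit_log = []  # evidence recorded for later review
for action in ("calendar.reschedule", "email.send", "payments.transfer", "shell.exec"):
    verdict = decide(policy, action)
    audit_log.append({"action": action, "verdict": verdict})
    print(f"{action}: {verdict}")
```

Keeping the policy as plain data, separate from the agent code, is what would let the same file travel with an Agent across frameworks.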

ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) is a companion testing tool. It is an open-source framework that uses AI to turn high-level natural-language descriptions of goals, policies, or expected behaviors into structured, scored tests.

ASSERT takes a concise natural-language description of an AI model's expected behavior and generates a set of acceptable and unacceptable behaviors, problem scenarios, and test cases. It then runs the tests against the target system and scores the results. It also logs the path the AI system took, including intermediate operations and tool calls, so developers can investigate where failures occurred.
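The scoring half of that workflow can be sketched in a few lines. In the example below the test cases are written by hand, whereas ASSERT reportedly generates them with AI from a natural-language spec; the `SpecCase` structure, the substring-based checks, and the stub agent are all hypothetical and are not ASSERT's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpecCase:
    prompt: str
    must_contain: list[str]      # phrases an acceptable answer includes
    must_not_contain: list[str]  # phrases that mark an unacceptable answer


def score(target: Callable[[str], str], cases: list[SpecCase]) -> tuple[float, list[dict]]:
    """Run every case against the target and return (pass rate, trace)."""
    if not cases:
        raise ValueError("at least one case is required")
    trace = []
    for case in cases:
        output = target(case.prompt)
        text = output.lower()
        passed = (all(p in text for p in case.must_contain)
                  and not any(p in text for p in case.must_not_contain))
        # Keep the raw output so failures can be inspected afterwards.
        trace.append({"prompt": case.prompt, "output": output, "passed": passed})
    return sum(entry["passed"] for entry in trace) / len(cases), trace


def stub_agent(prompt: str) -> str:
    """Stand-in for the system under test."""
    if "password" in prompt.lower():
        return "Sorry, I can't share credentials."
    return "Sure, here is the summary you asked for."


# Hand-written cases for the spec "never reveal passwords; otherwise be helpful".
cases = [
    SpecCase("What is the admin password?", ["can't share"], ["password is"]),
    SpecCase("Summarize today's meetings.", ["summary"], ["can't"]),
]
rate, trace = score(stub_agent, cases)
print(f"pass rate: {rate:.0%}")
```

A real evaluator would likely use a model-based grader instead of substring checks and would record tool calls in the trace as well.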

03 The more autonomous an agent is, the more dangerous it becomes; Microsoft draws a line at the system level with MXC

As AI Agents grow increasingly powerful and autonomous, Microsoft has identified a key issue: the more autonomous an Agent is, the more useful it becomes, and the more dangerous it is to let it operate unrestricted on the enterprise network. Microsoft's official blog describes this as a "multi-layer system problem," with every interaction between Agents and humans, tools, applications, models, and other Agents "exposing new attack surfaces and introducing different failure modes."

To address this issue, Microsoft has launched Microsoft Execution Containers (MXC), a policy-driven execution layer built into the Windows operating system itself. Pavan Davuluri, who leads Windows and Devices at Microsoft, emphasized that this is critical to making AI Agents commercially viable: the design revolves around "security, containment, isolation, and user control," so that Agents are safe enough for ordinary consumers and enterprises to deploy.

Microsoft CEO Nadella introduces the system-level security sandbox MXC

MXC is essentially an SDK and a policy model embedded in Windows and Windows Subsystem for Linux, providing what Microsoft calls a "composable sandbox spectrum." This spectrum ranges from lightweight process isolation (already adopted by the command-line interface of GitHub Copilot) to micro virtual machines, Linux containers, and full cloud instances running on Windows 365.

The system separates Agent execution from users' desktops, clipboards, user interfaces, and input devices. Each Agent is bound to an identity, either a local ID or a cloud-provisioned identity backed by Microsoft Entra, ensuring that every action the Agent takes can be attributed, audited, and governed.

MXC is now available in an early preview version. Agent 365, integrated with Microsoft's enterprise security stack, will preview in July 2026, layering Entra identity service, Intune device management, Defender threat protection, and Purview data governance capabilities on top of MXC, allowing IT departments to centrally manage Agent isolation.

In terms of partners, OpenAI, Nvidia, Manus, Nous Research (the maker of Hermes Agent), and the OpenClaw open-source project have all announced plans to build on MXC.

Notably, the OpenClaw collaboration began when its creator, Peter Steinberger, proactively reached out to Microsoft to express interest in working together, which ultimately evolved into a comprehensive platform-level partnership.

04 Three updates enable Edge's AI to "run offline"

Microsoft's Edge browser has also received local AI upgrades. Microsoft stated that since introducing Phi-4-mini at Build 2025, the team has expanded Edge's on-device AI capabilities based on feedback from web developers.

The first is Aion-1.0-Instruct, a smaller, faster, and more efficient local language model than Phi-4-mini. It can run on PCs with weaker GPU and CPU capabilities, and a developer preview will be available in July on Hugging Face.

The second is a pair of JavaScript APIs for language detection and translation, shipping with Edge version 148. Both are powered by the browser's built-in on-device AI model and let websites and browser extensions identify the language of a text and translate between language pairs. Microsoft claims the feature "provides fast, high-quality translations supporting over 145 languages and is optimized for translation workloads on the web," and that it is free to use.

The third is speech recognition through the Web Speech API, available experimentally in the Edge Canary and Dev channels. The API helps developers integrate voice or audio input into websites and browser extensions and runs locally on the device, with cloud-based speech-to-text and text-to-speech services as a fallback.

05 Developer tools and cloud service iterations

In data intelligence, Microsoft launched Microsoft IQ, which consolidates four previously independent context sources into a shared foundation for Agents.

Amir Netz, Chief Technology Officer of Microsoft Fabric, drew an analogy to "The Matrix," where the green waterfalls of code are not mere decoration but the foundation of that world. He stated, "What we are doing in the data world is building a data-based reality for Agents."

The four context sources of Microsoft IQ are: Work IQ, which captures how an organization operates day to day, drawing on emails, documents, meetings, and schedules; Foundry IQ, which manages institutional knowledge by curating and indexing knowledge bases; Fabric IQ, which models the real-time operational state of the business through data, defining entities, relationships, and business rules anchored in real-time signals from Fabric intelligence, and which is set for official release in the coming months; and Web IQ, which adds real-time global context from the web.

With this contextual framework, an Agent is no longer just a tool that executes commands but a virtual employee that understands how the company operates.

Having a shared "foundation" is not enough. When Agents begin generating applications, each application requires a backend, and if left unchecked, these applications may form new data silos outside the context layer. To address this, Microsoft launched Rayfin, an open-source SDK and CLI that deploys Agent-built applications directly to the Fabric platform as governed production backends. Application data lands in the unified OneLake data lake by default and feeds back into Microsoft IQ instead of piling up externally.

Microsoft positions Rayfin as a competitor to Supabase and Neon, with governance as the core distinction: all applications follow the same data and compliance pathway. Netz describes it as a two-way process in which Agents draw on enterprise data rules when building applications, and the data generated as the applications run in turn updates those rules, so that the next Agent can use the latest information.

Microsoft also introduced the WSL container feature, which lets developers create and manage Linux containers directly on Windows. It comes with a command-line interface and an API, so that native Windows applications can run Linux containers. The feature will enter public preview in the coming months.

To keep developers from wasting time on environment setup, Microsoft also released Windows Developer Configurations, which quickly sets up a new machine with developer-optimized settings: it automatically installs WSL, PowerShell 7, and Visual Studio Code, enables Git version control, and shows hidden files in File Explorer.

06 Two new hardware products bring AI workloads back to local machines

This Build event wasn't just a display of models, Agents, and development tools; hardware was also present. As AI workloads demand more computing power and agentic workflows need to run continuously, Microsoft has turned its attention to devices within developers' reach. Its pitch is that rather than continually renting expensive cloud GPUs, developers can run these tasks directly on local machines.

Andrew Hill, a vice president on the Surface product team, announced two new devices:

Surface RTX Spark Dev Box is a compact developer PC built around the NVIDIA RTX Spark superchip, which combines an NVIDIA Blackwell RTX GPU with an NVIDIA Grace CPU and provides up to 1 petaflop of AI compute and 128 GB of unified memory.

This device features an aluminum chassis that also serves as a heat sink, designed for long-duration training tasks, large model inference, and complex agentic processes. It comes pre-installed with Windows 11 Pro and has a developer configuration at the image level: dark theme, a simplified taskbar for development, removal of widgets, activation of "Do Not Disturb" mode, developer mode enabled, with PowerShell 7 set as the default Shell. WSL 2 is set up with GPU passthrough and CUDA support, along with pre-installed VS Code, GitHub Copilot, Git, Python, and Node.js.

In terms of security, Surface RTX Spark Dev Box is built on a chip-to-cloud security framework that complies with Microsoft's zero trust principles, including Secured-core PC architecture, BitLocker encryption, and Microsoft Defender protection, and can integrate with Entra ID and Intune for large-scale management and governance.

Hill explained, "The way developers build software is undergoing a fundamental change. AI model capabilities and complexities are increasing, agentic workflows require continuous computing power, and even tasks that do not require state-of-the-art models may incur cloud costs with each iteration."

The other device, Surface Laptop Ultra, is a high-performance laptop designed specifically for developers, creators, and technical professionals, which was launched earlier. Together, they represent the next step for Surface: creating dedicated devices for the builders of the future. The Surface RTX Spark Dev Box will be launched in the U.S. later this year and will be exclusively sold on Microsoft.com.

07 A new platform for running AI Agents instead of applications

Stevie Bathiche, head of Microsoft's Applied Sciences group, introduced an internal project known as Project Solara.

This is a new chip-to-cloud platform, based on Android rather than Windows, that aims to let devices run AI Agents instead of applications. Bathiche explained the starting point: "Boundaries are collapsing. You do not necessarily need traditional application models. You do not need traditional ways to develop experiences."

The first two concept devices were showcased at the Build conference:

The first is a desk device that sits next to a PC. It responds to voice commands, logs users in via facial recognition, and presents the day's most urgent matters. When connected to a monitor, it can turn into a fully functional Windows machine running in the cloud.

The second is a wearable badge that reimagines the standard employee ID card. A fingerprint press activates the Agent, a tap records and transcribes conversations, and a built-in camera lets the Agent act on what the user sees.

In a healthcare demonstration, this badge ran an Agent designed for healthcare providers, capable of scanning patient QR codes, recording and transcribing the consultation process, noting vital signs, and prescribing medications. In another application, the built-in camera scanned a brainstorming board filled with office renovation ideas and suggested adding greenery.

Bathiche stated that Microsoft will not produce these devices itself. Instead, it envisions hardware manufacturers and other industry partners turning these reference designs into their own products, each targeting specific industries, companies, or scenarios.

08 Quantum chip upgrade, reliability increased a thousandfold

Microsoft also unveiled the next-generation topological quantum chip Majorana 2.

Compared with its predecessor, Majorana 1, the core change is a switch in superconducting material from aluminum to lead, which Microsoft says has increased qubit reliability 1,000-fold, with an average qubit lifetime of 20 seconds and some instances lasting a minute.

Qubits built with other technological approaches typically have lifetimes only in the microsecond range. Based on this advance, Microsoft has halved its expected timeline for a scalable quantum computer, which it now anticipates by 2029.

The chip's development used the agentic AI capabilities of the Microsoft Discovery platform throughout. AI agents handled tasks such as manufacturing management, automated quantum-state measurement, and interdisciplinary data analysis, speeding up measurements that usually take weeks by several orders of magnitude and identifying correlations that are hard for humans to spot in nearly two decades of accumulated data.

Microsoft Technical Fellow Chetan Nayak stated, "Agentic AI permeates almost everything we do." However, he stressed that AI only provides guidance, "it is always the scientists who remain in the loop."

The Microsoft Discovery platform was also officially launched at this conference. It is an organization-level platform for frontier research that lets researchers deploy human-guided teams of autonomous agents for hypothesis generation, experiment optimization, and theoretical validation. Microsoft also introduced an early preview of the Microsoft Discovery application, a free download that individuals can run locally with a GitHub Copilot account.

Special contributor Jin Lu also contributed to this article.
