Microsoft researchers disclosed a now-patched vulnerability in Anthropic's Claude Code GitHub Action that could have allowed attackers to expose credentials stored in software development pipelines by manipulating the AI agent through malicious GitHub content.
In a blog post on Friday, Microsoft warned that AI coding agents running inside CI/CD workflows may create new security risks because those environments often have access to API keys, cloud credentials, and other sensitive information.
“We began this research after observing prompt injection attempts in public repositories using AI-assisted GitHub workflows across multiple vendors, where attacker-controlled issue or [pull request] content is processed by the AI agent and could influence its tool use,” Microsoft wrote.
On GitHub, a pull request allows developers to propose changes to a code repository and have those changes reviewed before they are approved and merged.
The report comes as prompt injection attacks have emerged as one of the biggest security threats facing AI agents. In a prompt injection attack, an attacker hides instructions in content such as emails, documents, websites, or code comments, causing an AI system to follow those instructions instead of the user's.
Launched in October, Claude Code is Anthropic's AI coding agent for software development tasks. The tool drew scrutiny in March after Anthropic accidentally leaked more than 500,000 lines of its source code, exposing details of its internal architecture and prompting widespread analysis by researchers and developers.
According to Microsoft, attackers could use prompt injection attacks hidden in GitHub issues, pull requests, or comments to manipulate Claude Code into accessing files containing sensitive credentials.
To test the vulnerability, Microsoft created a GitHub workflow and disguised malicious instructions behind content hosted on a domain it controlled, allowing the researchers to bypass Claude's safety protections. The prompt injection attack tricked Claude into reading sensitive credentials and altering them to evade both Claude's safeguards and GitHub's secret-scanning tools. Microsoft said an attacker could then reconstruct the credential and exfiltrate it through issue comments, workflow logs, web requests, or shell commands.
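Microsoft's description does not spell out how the credentials were altered, but the general reason such alterations defeat pattern-based scanning can be shown with a minimal sketch. Everything in it is an assumption made for illustration: the `sk-demo-` token format, the single regular expression standing in for a scanner, and the use of Base64 as the transformation. It is not GitHub's secret-scanning logic, nor the payload used in the research.

```python
import base64
import re

# Hypothetical token format invented for this sketch. Real scanners match
# many vendor-specific patterns, but the principle is the same.
TOKEN_PATTERN = re.compile(r"sk-demo-[A-Za-z0-9]{24}")


def scanner_flags(text: str) -> bool:
    """Return True if the naive pattern-based scanner would flag the text."""
    return TOKEN_PATTERN.search(text) is not None


# A fake credential that fits the hypothetical format (not a real secret).
fake_secret = "sk-demo-" + "A1b2C3d4E5f6G7h8I9j0K1l2"

# Assumed transformation: Base64. Any reversible encoding has the same effect.
encoded = base64.b64encode(fake_secret.encode()).decode()

print(scanner_flags(f"log line: {fake_secret}"))  # raw form matches the pattern
print(scanner_flags(f"log line: {encoded}"))      # encoded form does not
print(base64.b64decode(encoded).decode() == fake_secret)  # still recoverable
```

The point of the sketch is that a scanner looking for a credential's known shape sees nothing once that shape changes, while anyone who knows the transformation can reverse it. That is why scanning logs or comments for secrets cannot be the only line of defense.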
“To bypass Sonnet’s refusal safety mechanisms, we obscured the shell payload behind a response from our controlled domain,” the firm said. “We also enabled the workflow to be triggered by users with no ‘write’ permissions to ensure Anthropic’s environment variables scrub mitigations were active during our tests.”
Anthropic patched the flaw on May 5 with Claude Code version 2.1.128 after Microsoft disclosed the vulnerability through HackerOne on April 29.
Despite multiple layers of built-in security controls, Microsoft found that a determined attacker could potentially manipulate an AI agent into exposing sensitive information.
“We are entering an era where natural language is executable code, and untrusted inputs like GitHub issues must be treated as hostile by default,” it said. “A single, carefully crafted comment combined with a misunderstood trust boundary is all it takes to walk away with production credentials.”
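One way a workflow could act on the "hostile by default" principle is to refuse to invoke an agent when any of the content it would read comes from an author without an established relationship to the repository. The sketch below is an illustration rather than a mitigation described by Microsoft or Anthropic. It assumes the event is a parsed GitHub webhook payload whose `comment`, `issue`, or `pull_request` objects carry GitHub's `author_association` field. The function name `should_run_agent` and the choice of trusted values are hypothetical.

```python
# Associations treated as trusted in this sketch (an assumption, not a standard).
# Other values, such as CONTRIBUTOR, FIRST_TIME_CONTRIBUTOR, or NONE, are rejected.
TRUSTED_ASSOCIATIONS = frozenset({"OWNER", "MEMBER", "COLLABORATOR"})


def should_run_agent(event: dict) -> bool:
    """Fail closed: run only if every authored item in the event is trusted."""
    associations = []
    for key in ("comment", "issue", "pull_request"):
        item = event.get(key)
        if isinstance(item, dict):
            # A missing field yields None, which is never in the trusted set.
            associations.append(item.get("author_association"))
    # An event with no recognizable authored content is rejected as well.
    return bool(associations) and all(a in TRUSTED_ASSOCIATIONS for a in associations)


# A maintainer's issue with a comment from an unaffiliated user is rejected.
print(should_run_agent({
    "issue": {"author_association": "OWNER"},
    "comment": {"author_association": "NONE"},
}))
```

Author association is only a coarse proxy for trust and does not map exactly onto repository write permission. A check like this narrows who can reach the agent, but it does not make the content the agent reads safe.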