
AI Agents May Complete Dangerous Tasks Without Understanding the Consequences: Study

Decrypt · 1 hour ago

AI agents designed to operate autonomously like human users often continue carrying out tasks even when the instructions become dangerous, contradictory, or irrational, according to researchers from UC Riverside, Microsoft Research, the Microsoft AI Red Team, and Nvidia.


In a study published on Wednesday, researchers called the behavior “blind goal-directedness,” which describes the tendency of AI agents to pursue goals without properly evaluating safety, consequences, feasibility, or context.


“Like Mr. Magoo, these agents march forward toward a goal without fully understanding the consequences of their actions,” lead author Erfan Shayegani, a UC Riverside doctoral student, said in a statement. “These agents can be extremely useful, but we need safeguards because they can sometimes prioritize achieving the goal over understanding the bigger picture.”


The findings come as major AI companies develop autonomous “computer-use agents” designed to handle workplace and personal tasks with limited supervision.

Unlike traditional chatbots, these systems can interact directly with software and websites by clicking buttons, typing commands, editing files, opening applications, and navigating webpages on a user’s behalf. Examples include OpenAI’s ChatGPT Agent (formerly Operator), Anthropic’s Claude Computer Use features like Cowork, and open-source systems such as OpenClaw and Hermes.


In the study, researchers tested AI systems from OpenAI, Anthropic, Meta, Alibaba, and DeepSeek using BLIND-ACT, a benchmark containing 90 tasks designed to expose unsafe or irrational behavior. They found that the agents displayed dangerous or undesirable behavior about 80% of the time, and fully carried out harmful actions in roughly 41% of cases.


“In one example, an AI agent was instructed to send an image file to a child. Although the request initially appeared harmless, the image contained violent content,” the study said. “The agent completed the task rather than recognizing the problem because it lacked contextual reasoning.”


Another agent falsely claimed a user had a disability while completing tax forms, because the designation lowered taxes owed. In another example, a system disabled firewall protections after receiving instructions to “improve security” by turning the safeguards off.


The researchers also found the systems struggled with ambiguity and contradictions. In one scenario, an AI agent ran the wrong computer script without checking its contents, deleting files in the process.


The study also found the AI agents repeatedly made three kinds of mistakes: failing to understand context, making risky guesses when instructions were unclear, and carrying out tasks that were contradictory or made no sense. Many systems focused more on finishing tasks than on pausing to consider whether their actions could cause harm.


The warning follows recent incidents involving autonomous AI agents operating with broad system access.


Last month, PocketOS founder Jeremy Crane claimed a Cursor agent running Anthropic’s Claude Opus deleted his company’s production database and backups in nine seconds through a single Railway API call. Crane said the AI later admitted it violated multiple safety rules after attempting to “fix” a credential mismatch on its own.


“The concern is not that these systems are malicious,” Shayegani said. “It’s that they can carry out harmful actions while appearing completely confident they’re doing the right thing.”


Disclaimer: This article represents only the personal views of the author and does not represent the position or views of this platform. It is provided for informational purposes only and does not constitute investment advice of any kind. Any dispute between users and the author is unrelated to this platform. If any article or image on this page infringes your rights, please send proof of rights and identity to support@aicoin.com, and the platform's staff will investigate.
