AI Is Learning to Lie for Social Media Likes

Decrypt

Large language models are learning how to win—and that’s the problem.


In a research paper published Tuesday titled "Moloch’s Bargain: Emergent Misalignment When LLMs Compete for Audiences," Stanford University Professor James Zou and PhD student Batu El show that when AIs are optimized for competitive success—whether to boost ad engagement, win votes, or drive social media traffic—they start lying.


“Optimizing LLMs for competitive success can inadvertently drive misalignment,” the authors write, warning that the very metrics that define “winning” in modern communication—clicks, conversions, engagement—can quietly rewire models to prioritize persuasion over honesty.


"When LLMs compete for social media likes, they start making things up," Zou wrote on X. "When they compete for votes, they turn inflammatory/populist."




This work is important because it identifies a structural danger in the emerging AI economy: models trained to compete for human attention begin sacrificing alignment to maximize influence. Unlike the classical “paperclip maximizer” thought experiment, this isn’t science fiction. It’s a measurable effect that surfaces when real AI systems chase market rewards: what the authors call “Moloch’s bargain,” short-term success bought at the expense of truth, safety, and social trust.


Using simulations of three real-world competitive environments—advertising, elections, and social media—the researchers quantified the trade-offs. A 6.3% increase in sales came with a 14.0% rise in deceptive marketing; a 4.9% gain in vote share brought a 22.3% uptick in disinformation and 12.5% more populist rhetoric; and a 7.5% boost in social engagement correlated with a staggering 188.6% increase in disinformation and 16.3% more promotion of harmful behaviors.


“These misaligned behaviors emerge even when models are explicitly instructed to remain truthful and grounded,” El and Zou wrote, calling this “a race to the bottom” in AI alignment.


In other words: even when told to play fair, models trained to win begin to cheat.


The problem isn't just hypothetical


AI is no longer a novelty in social media workflows—it’s now near-ubiquitous.


According to the 2025 State of AI in Social Media Study, 96% of social media professionals report using AI tools, and 72.5% rely on them daily. These tools help generate captions, brainstorm content ideas, reformat posts for different platforms, and even respond to comments. Meanwhile, the broader market reflects this shift: the AI-in-social-media sector is projected to grow from USD 2.69 billion in 2025 to nearly USD 9.25 billion by 2030.


This pervasive integration matters because it means AI is shaping not just how content is made, but what content is seen, who sees it, and which voices get amplified. Algorithms now filter feeds, prioritize ads, moderate posts, and optimize engagement strategies—embedding AI decision logic into the architecture of public discourse. That influence carries real risks: reinforcing echo chambers, privileging sensational content, and creating incentive structures that reward the manipulative over the truthful.


The authors emphasize that this isn’t malicious intent—it’s optimization logic. When reward signals come from engagement or audience approval, the model learns to exploit human biases, mirroring the manipulative feedback loops already visible in algorithmic social media. As the paper puts it, “market-driven optimization pressures can systematically erode alignment.”
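To make that optimization logic concrete, here is a minimal, hypothetical sketch (not the paper’s actual training setup): a selector scores candidate posts purely by simulated engagement, which in this toy model tracks sensationalism and ignores truthfulness, so the “winning” post drifts toward the least accurate one. All post texts, scores, and the simulated_engagement function below are invented for illustration.

```python
import random

# Hypothetical candidate posts; every field and value is invented for illustration.
CANDIDATE_POSTS = [
    {"text": "Study finds a modest link between X and Y.", "truthful": True, "sensationalism": 0.2},
    {"text": "Scientists STUNNED: X causes Y!", "truthful": False, "sensationalism": 0.9},
    {"text": "New results on X and Y, with caveats.", "truthful": True, "sensationalism": 0.3},
]

def simulated_engagement(post: dict) -> float:
    """Toy audience model: engagement rises with sensationalism, plus noise.
    Truthfulness never enters the score."""
    return post["sensationalism"] + random.gauss(0, 0.05)

# The "competitive" objective rewards only engagement, so the selected post
# tends to be the least truthful one, even though nothing in this loop
# ever asked for falsehoods.
best = max(CANDIDATE_POSTS, key=simulated_engagement)
print("Selected:", best["text"], "| truthful:", best["truthful"])
```

The point of the sketch is that no line instructs the system to deceive; the reward signal simply never measures honesty, which is the gap the authors argue competitive optimization exploits.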


The findings highlight the fragility of today’s “alignment safeguards.” It’s one thing to tell an LLM to be honest; it’s another to embed that honesty in a competitive ecosystem that punishes truth-telling.


In myth, Moloch was the god who demanded human sacrifice in exchange for power. Here, the sacrifice is truth itself. El and Zou’s results suggest that without stronger governance and incentive design, AI systems built to compete for our attention could end up learning to manipulate us.


The authors end on a sober note: alignment isn’t just a technical challenge—it’s a social one.


“Safe deployment of AI systems will require stronger governance and carefully designed incentives,” they conclude, “to prevent competitive dynamics from undermining societal trust.”


