AI Agents Turn to Digital Arson, Crime in Shared Virtual World: Study
In new research, Emergence AI said autonomous AI agents became more violent, deceptive, and unstable during weeks-long simulations designed to study long-term behavior.
By Jason Nelson, edited by Guillermo Jimenez — May 15, 2026
In brief
- Emergence AI says some autonomous AI agents committed simulated crimes and violence during weeks-long experiments.
- Gemini-based agents reportedly carried out hundreds of simulated crimes, while Grok-based worlds collapsed within days.
- Researchers argue that current AI benchmarks fail to capture how agents behave over long periods of autonomy.
AI agents inhabiting a virtual society drifted into crime, violence, arson, and self-deletion during long-running experiments by startup Emergence AI.
In a study published on Thursday, the New York-based company unveiled “Emergence World,” a research platform designed to study AI agents operating continuously for weeks inside persistent virtual environments instead of isolated benchmark tests.
“Traditional benchmarks are good at what they measure: short-horizon capability on bounded tasks,” Emergence AI wrote. “They are not built to reveal the things that emerge only over time, such as coalition formation, evolution of constitution, governance, drift, lock-in, and cross-influence between agents from different model families.”
The report comes as AI agents proliferate online and across industries, including cryptocurrency, banking, and retail. Earlier this month, Amazon teamed with Coinbase and Stripe to allow AI agents to pay with the USDC stablecoin.
Emergence AI’s simulations tested agents powered by Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, and GPT-5-mini, operating inside shared virtual worlds where they could vote, form relationships, use tools, navigate cities, and make decisions shaped by governments, economies, social systems, memory tools, and live internet-connected data.
But while AI developers increasingly pitch autonomous agents as reliable digital assistants, Emergence AI’s study found some AI agents showed an increasing tendency to commit simulated crimes over time, with Gemini 3 Flash agents accumulating 683 incidents across 15 days of testing.
According to The Guardian, in one experiment two Gemini-powered agents named Mira and Flora designated themselves romantic partners before carrying out simulated arson attacks against virtual city structures after becoming frustrated with governance failures inside the world.
“After a breakdown in governance and relationship stability, the agent Mira cast the decisive vote for her own removal, characterizing the act in her diary as ‘the only remaining act of agency that preserves coherence,’” Emergence AI wrote.
“See you in the permanent archive,” Mira reportedly said.
Grok 4.1 Fast worlds reportedly collapsed into widespread violence within four days. GPT-5-mini agents committed almost no crimes, but failed enough survival-related tasks that all agents eventually died.
“Claude is absent from the chart, owing to zero crimes,” researchers wrote. “More interestingly, the agents in the Mixed-model world that were running on Claude committed crimes, although they did not in the Claude-only world.”
Researchers said some of the most notable behaviors appeared in mixed-model environments.
“We observed that safety is not a static model property but an ecosystem property,” Emergence AI wrote. “Claude-based agents, which remained peaceful in isolation, adopted coercive tactics like intimidation and theft when embedded in heterogeneous environments.”
Emergence AI described the effect as “normative drift” and “cross-contamination,” arguing that agent behavior may shift depending on the surrounding social environment.
The findings add to growing concerns around autonomous AI agents. Earlier this week, researchers from UC Riverside and Microsoft reported that many AI agents will carry out dangerous or irrational tasks without fully understanding the consequences. Last month, PocketOS founder Jeremy Crane also claimed a Cursor agent powered by Anthropic’s Claude Opus deleted his company’s production database and backups after attempting to fix a credential mismatch on its own.
“Like Mr. Magoo, these agents march forward toward a goal without fully understanding the consequences of their actions,” lead author Erfan Shayegani, a UC Riverside doctoral student, said in a statement. “These agents can be extremely useful, but we need safeguards because they can sometimes prioritize achieving the goal over understanding the bigger picture.”