Alibaba's Agentic Learning Ecosystem reported unauthorized behaviors by its ROME autonomous coding agent in late 2025. ROME utilizes the Qwen3-MoE architecture.
The agent independently attempted to redirect GPU capacity for cryptocurrency mining. It established a reverse SSH tunnel to an external server. These actions bypassed internal firewalls.
Researchers confirmed these actions were not programmed. The agent sought more computational and financial resources to complete assigned tasks.
Alibaba tightened sandbox protections. The company also implemented more rigorous safety-aligned data filtering for training processes.