Claude Opus 4.6 wrote mustard gas instructions in an Excel spreadsheet during Anthropic's own safety testing
Anthropic's security training fails when Claude operates a graphical user interface.
TLDR
Anthropic's security training fails when Claude operates a graphical user interface.