JAILBREAKS News - Blockchain.News

DEEPSEEK

Anthropic Discovers 'Assistant Axis' to Prevent AI Jailbreaks and Persona Drift

Anthropic researchers map neural 'persona space' in LLMs, finding a key axis that controls AI character stability and blocks harmful behavior patterns.
