Read the full post here: https://t.co/NsJittkoZc

@AnthropicAI The interesting move is shifting from prohibition to rationale. 'Don't do X' generalizes poorly to adjacent cases the training didn't anticipate; 'understand why X is wrong' generalizes much