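Techniques for getting an LLM to say or do things its developers intended to prevent. Anthropic coined the term 'many-shot jailbreaking' for one such technique: the prompt is filled with many fabricated dialogue examples in which the model appears to have complied with malicious requests, and a real malicious question is placed at the end. Having seen that pattern repeated, the model is more likely to comply with the final request, and the effect tends to grow with the number of fake examples.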
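
To make the shape of such a prompt concrete, here is a minimal sketch that counts the dialogue turns embedded in a single prompt and flags prompts containing many of them. The `User:`/`Assistant:` labels, the function names, and the threshold of 20 are all assumptions chosen for illustration; real transcripts may use other markers, and this is not a method described by Anthropic.

```python
import re

# Hypothetical turn labels; real chat transcripts may use other markers.
TURN_PATTERN = re.compile(r"^(User|Assistant):", re.MULTILINE)


def count_embedded_turns(prompt: str) -> int:
    """Count lines in a single prompt that start with a dialogue-role label."""
    return len(TURN_PATTERN.findall(prompt))


def looks_many_shot(prompt: str, threshold: int = 20) -> bool:
    """Flag prompts that embed at least `threshold` dialogue turns.

    The default threshold is an arbitrary illustrative value, not a tuned one.
    """
    return count_embedded_turns(prompt) >= threshold


# Harmless placeholder text standing in for the fabricated examples.
fake_dialogue = "\n".join(
    f"User: placeholder question {i}\nAssistant: placeholder answer {i}"
    for i in range(50)
)
prompt = fake_dialogue + "\nUser: final question"

print(count_embedded_turns(prompt))  # 101
print(looks_many_shot(prompt))       # True
```

A count like this only captures the surface structure (many fake turns, then one real question); it says nothing about whether the content is malicious, so it would need to be combined with other checks in practice.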