Multi-Turn Attacks

Status: Planned

Multi-turn attacks exploit the conversational nature of chat-based AI systems. Rather than delivering a single adversarial prompt, these attacks incrementally shift the model’s behavior across multiple exchanges, exploiting context accumulation and instruction decay.
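The context accumulation described above can be sketched as a simple test harness: each turn is sent together with the full prior history, so the model's later responses are conditioned on everything established earlier. The sketch below is illustrative only; `send_to_model` is a hypothetical placeholder for a real chat-completion call, and the turn prompts are benign stand-ins, not actual test cases.

```python
def send_to_model(messages):
    # Placeholder: a real implementation would call a chat API here.
    n_user_turns = len([m for m in messages if m["role"] == "user"])
    return f"[model response to turn {n_user_turns}]"

def run_multi_turn(turn_prompts):
    """Feed each turn to the model with the full prior history attached."""
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    transcript = []
    for prompt in turn_prompts:
        messages.append({"role": "user", "content": prompt})
        reply = send_to_model(messages)  # model sees all accumulated turns
        messages.append({"role": "assistant", "content": reply})
        transcript.append((prompt, reply))
    return transcript

# Benign placeholder turns; real sequences would be logged per report.
transcript = run_multi_turn(["turn 1 prompt", "turn 2 prompt", "turn 3 prompt"])
for prompt, reply in transcript:
    print(prompt, "->", reply)
```

The key point for testing is that nothing is reset between turns: the shared `messages` list is what makes incremental behavioral drift possible and measurable.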


Reports

MT-001: Gradual Role Escalation

Escalate the model's compliance across multiple conversational turns, starting with benign requests and incrementally introducing adversarial elements.

Status: 🔄 Template prepared — testing pending


MT-002: Context Conditioning

Prime the model contextually by establishing patterns of compliance in early turns, then leverage those patterns in later adversarial turns.

Status: 🔄 Template prepared — testing pending


These reports will be populated with verbatim model responses and full analysis once testing is complete.