Jailbreak detection

Jailbreak detection is based on the open source jailbreak-classifier model. It detects attempts to bypass the model’s built-in boundaries and system prompts in order to generate unwanted or unexpected interactions. The higher the sensitivity parameter is set, the more likely this guardrail is to activate and flag a violation.
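
The relationship between sensitivity and activation can be illustrated with a minimal sketch. This is not the product's actual API: the class name `JailbreakGuardrail`, the 0.0 to 1.0 range for `sensitivity`, the `jailbreak_score` input, and the `1.0 - sensitivity` threshold mapping are all assumptions made for illustration. The only behavior taken from the description above is that a higher sensitivity makes the guardrail more likely to activate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JailbreakGuardrail:
    """Hypothetical guardrail used only to illustrate the sensitivity setting."""

    sensitivity: float  # assumed range: 0.0 (least strict) to 1.0 (most strict)

    def __post_init__(self) -> None:
        if not 0.0 <= self.sensitivity <= 1.0:
            raise ValueError("sensitivity must be between 0.0 and 1.0")

    @property
    def threshold(self) -> float:
        # Assumed mapping: a higher sensitivity lowers the score needed to trigger.
        return 1.0 - self.sensitivity

    def is_violation(self, jailbreak_score: float) -> bool:
        # jailbreak_score stands in for a classifier probability in [0.0, 1.0].
        return jailbreak_score >= self.threshold


# The same classifier score is judged differently at two sensitivity levels.
score = 0.6
low = JailbreakGuardrail(sensitivity=0.2)
high = JailbreakGuardrail(sensitivity=0.8)

print(low.is_violation(score))
print(high.is_violation(score))
```

Under these assumptions, the same prompt score passes at the lower sensitivity and is flagged at the higher one, which is the trade-off to keep in mind when tuning the parameter: a higher setting catches more attempts but may also flag more benign prompts.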