Jailbreak detection
Jailbreak detection is based on the open-source jailbreak-classifier model. It lets users apply a detection mechanism that catches attempts to bypass the model's built-in boundaries and system prompts in order to generate unwanted or unexpected interactions. The higher the sensitivity parameter is set, the more likely this guardrail is to activate and flag a violation.
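To illustrate the relationship between sensitivity and activation, here is a minimal Python sketch. It is not the product's actual API: the `JailbreakGuardrail` class, the 0.0 to 1.0 sensitivity range, and the `1.0 - sensitivity` threshold mapping are all assumptions made for illustration, and `jailbreak_score` stands in for whatever probability the classifier returns.

```python
from dataclasses import dataclass


@dataclass
class JailbreakGuardrail:
    """Hypothetical guardrail wrapper; names and ranges are assumptions."""

    sensitivity: float  # assumed range: 0.0 (lenient) to 1.0 (strict)

    def __post_init__(self) -> None:
        if not 0.0 <= self.sensitivity <= 1.0:
            raise ValueError("sensitivity must be between 0.0 and 1.0")

    def threshold(self) -> float:
        # Assumed mapping: higher sensitivity lowers the score needed to trigger.
        return 1.0 - self.sensitivity

    def is_violation(self, jailbreak_score: float) -> bool:
        # jailbreak_score is a stand-in for the classifier's probability output.
        return jailbreak_score >= self.threshold()


score = 0.6  # made-up classifier score for a single prompt
lenient = JailbreakGuardrail(sensitivity=0.2)  # triggers only on scores >= 0.8
strict = JailbreakGuardrail(sensitivity=0.7)   # triggers on scores >= roughly 0.3

print(lenient.is_violation(score))
print(strict.is_violation(score))
```

Under these assumptions, the same prompt passes the lenient configuration but is flagged by the strict one, which is the trade-off to weigh when choosing a sensitivity: higher values catch more attempts but also raise the chance of flagging benign prompts.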
