Architecture

SUPERWISE® Sentinel uses a decoupled, split-plane architecture. You manage policies and view telemetry from a central control plane, while security and privacy are enforced locally — inside your own network perimeter.

The two planes

Control plane (SaaS cloud)

The SUPERWISE® SaaS platform acts as the central orchestrator hosted in the cloud. It is responsible for:

Managing your global Sentinel gateway registry
Providing holistic operational, latency, and security remediation reports
Serving each gateway's guardrail configuration, which gateways pull over their outbound heartbeat connection

Data plane (customer VPC)

The local Sentinel gateways act as high-performance, stateless reverse proxies running directly where your applications live. They handle 100% of the active traffic interception, real-time guardrail execution, and direct upstream routing to AI models.

Core technical features

Containerized deployment & scale

The Sentinel is distributed natively as a lightweight Docker image. For local development and quick staging, it can be spun up in seconds using Docker containers. For enterprise, production-scale deployments across high-availability clusters, SUPERWISE® provides a standardized Helm chart to deploy and auto-scale Sentinels inside Kubernetes. See Deploy a gateway.

Supported providers

Out of the box, the Sentinel engine integrates with major foundation model APIs. It maps and translates incoming client payloads to the following covered providers:

OpenAI API
Anthropic API
Gemini API
Custom providers — any OpenAI-compatible endpoint (see Custom providers)

Configuration-based routing (zero code changes)

Traffic is redirected to the gateway purely through standard environment configuration. Because the Sentinel exposes open-standard endpoints, developers do not need to rewrite their code or change their underlying SDK libraries. To instrument any application, you simply update the provider's base URL to point to your running Sentinel instance. See Connect your applications.

Request interception & proxy flow

The gateway is a reverse proxy: your applications send their LLM traffic to it, and it forwards each request upstream. It sits only in your LLM path — not in front of all your traffic — and intercepts the primary text-generation routes:

Provider	Intercepted endpoint route
OpenAI	v1/chat/completions
Anthropic	v1/messages
Gemini	generateContent

Based on the provider given, only the intercepted endpoint routes listed above are processed by the proxy. All other routes pass through as usual without interception (for example, the embeddings API for OpenAI or any other non-generation endpoint).

Transparent payload processing

When a generation route is triggered:

The Sentinel extracts and parses the text payload (prompts/messages)
The extracted text is run through the local guardrail stack for active remediation
Pass-through execution: Aside from the specific text content analyzed and modified by the active guardrails, all other application parameters and metadata (including non-text and multi-modal fields) are passed as-is to the upstream LLM provider
Response pass-through: The upstream provider's response is streamed back to the client application unmodified, with its original structure intact. Guardrails currently run on inbound prompts only.

For how data is kept inside your perimeter, see Data privacy.