Architecture
Decentralized Architecture: SaaS Meets Sentinel
SUPERWISE™ Sentinel is built on a decoupled, split-plane architecture. By separating the Control Plane from the Data Plane, SUPERWISE™ allows you to centrally manage AI policies and observe operational telemetry while enforcing security and privacy locally within your own private network perimeter.
The Two Planes
Control Plane (SaaS Cloud)
The SUPERWISE™ SaaS platform acts as the central orchestrator hosted in the cloud. It is responsible for:
- Managing your global Sentinel node registry
- Providing holistic operational, latency, and security remediation reports
- Delivering real-time configurations downstream to active nodes
Data Plane (Customer VPC)
The local Sentinel nodes act as high-performance, stateless proxies running directly where your applications live. They handle 100% of the active traffic interception, real-time guardrail execution, and direct upstream routing to AI models.
Core Technical Features
Containerized Deployment & Scale
The Sentinel is distributed natively as a lightweight Docker Image. For local development and quick staging, it can be spun up in seconds using docker containers. For enterprise, production-scale deployments across high-availability clusters, SUPERWISE™ provides a standardized Helm Chart to easily deploy and auto-scale Sentinels inside Kubernetes.
Supported Providers
Out of the box, the Sentinel engine integrates seamlessly with major foundation model APIs. It maps and translates incoming client payloads to the following covered providers:
- OpenAI API
- Anthropic API
- Gemini API
- Custom Providers (Any internal or alternative model endpoint can be supplied directly to the Sentinel container configuration)
Configuration-Based Routing (Zero Code Changes)
Traffic is redirected to the gateway purely through standard environmental configuration. Because the Sentinel exposes open-standard endpoints, developers do not need to rewrite their code or change their underlying SDK libraries. To instrument any application, you simply modify the LLM client configuration by updating the provider's base_url to point to your running local Sentinel instance.
Request Interception & Proxy Flow
The Sentinel behaves as a transparent reverse proxy. It specifically listens for and intercepts execution traffic targeting the primary text and multi-modal generation endpoints:
| Provider | Intercepted Endpoint Route |
|---|---|
| OpenAI | v1/chat/completions |
| Anthropic | v1/messages |
| Gemini | generateContent |
Based on the provider given, only the intercepted endpoint routes listed above are processed by the proxy. All other routes pass through as usual without interception (for example, the embeddings API for OpenAI or any other non-generation endpoints).
Transparent Payload Processing
When a generation route is triggered:
- The Sentinel extracts and parses the text payload (prompts/messages)
- The extracted text is run through the local Guardrail Stack (Secret Detection, PII Masking, and Toxicity Filtering) for active remediation
- Pass-Through Execution: Aside from the specific text content analyzed and modified by the active guardrails, all other application parameters and metadata are passed as-is to the upstream LLM provider
- The return response from the provider is processed identically (scanning the output for toxicity and unmasking PII) before being passed seamlessly back to the client application with its original structure intact
Data Privacy & Telemetry Sync
SUPERWISE™ is engineered around an absolute privacy boundary.
+-----------------------+ +-------------------------+
| SUPERWISE SaaS Cloud | | Customer VPC |
| (Control Plane) | | (Data Plane) |
+-----------+-----------+ +------------+------------+
| |
| <--- 1. Periodic Heartbeats & Telemetry -- | [Intercepts Traffic]
| --- 2. Guardrail Configuration Pushes ---> | [Enforces Guardrails]
| | [Routes to Providers]
X <========================================== X
[ DATA PRIVACY BOUNDARY ]
NO CONVERSATION DATA LEAKS OUTSIDEDynamic Sync: Running Sentinel nodes establish a secure outbound connection to the Control Plane to send periodic heartbeats and basic operational metadata. In return, they fetch the active guardrail configurations needed to execute locally.
Telemetry Only: Telemetry sent to the cloud is strictly limited to volumetric counts, triggered security violation events, and proxy latency measurements.
Zero Interactivity Leaks: No raw data from your AI interactions is ever transmitted to the cloud. Your prompts, model responses, system instructions, and customer records stay 100% inside your network perimeter.
