HomeGuidesAPI ReferenceRelease notes
Log In
Guides

Architecture

Decentralized Architecture: SaaS Meets Sentinel

SUPERWISE™ Sentinel is built on a decoupled, split-plane architecture. By separating the Control Plane from the Data Plane, SUPERWISE™ allows you to centrally manage AI policies and observe operational telemetry while enforcing security and privacy locally within your own private network perimeter.

The Two Planes

Control Plane (SaaS Cloud)

The SUPERWISE™ SaaS platform acts as the central orchestrator hosted in the cloud. It is responsible for:

  • Managing your global Sentinel node registry
  • Providing holistic operational, latency, and security remediation reports
  • Delivering real-time configurations downstream to active nodes

Data Plane (Customer VPC)

The local Sentinel nodes act as high-performance, stateless proxies running directly where your applications live. They handle 100% of the active traffic interception, real-time guardrail execution, and direct upstream routing to AI models.

Core Technical Features

Containerized Deployment & Scale

The Sentinel is distributed natively as a lightweight Docker Image. For local development and quick staging, it can be spun up in seconds using docker containers. For enterprise, production-scale deployments across high-availability clusters, SUPERWISE™ provides a standardized Helm Chart to easily deploy and auto-scale Sentinels inside Kubernetes.

Supported Providers

Out of the box, the Sentinel engine integrates seamlessly with major foundation model APIs. It maps and translates incoming client payloads to the following covered providers:

  • OpenAI API
  • Anthropic API
  • Gemini API
  • Custom Providers (Any internal or alternative model endpoint can be supplied directly to the Sentinel container configuration)

Configuration-Based Routing (Zero Code Changes)

Traffic is redirected to the gateway purely through standard environmental configuration. Because the Sentinel exposes open-standard endpoints, developers do not need to rewrite their code or change their underlying SDK libraries. To instrument any application, you simply modify the LLM client configuration by updating the provider's base_url to point to your running local Sentinel instance.

Request Interception & Proxy Flow

The Sentinel behaves as a transparent reverse proxy. It specifically listens for and intercepts execution traffic targeting the primary text and multi-modal generation endpoints:

ProviderIntercepted Endpoint Route
OpenAIv1/chat/completions
Anthropicv1/messages
GeminigenerateContent

Based on the provider given, only the intercepted endpoint routes listed above are processed by the proxy. All other routes pass through as usual without interception (for example, the embeddings API for OpenAI or any other non-generation endpoints).

Transparent Payload Processing

When a generation route is triggered:

  1. The Sentinel extracts and parses the text payload (prompts/messages)
  2. The extracted text is run through the local Guardrail Stack (Secret Detection, PII Masking, and Toxicity Filtering) for active remediation
  3. Pass-Through Execution: Aside from the specific text content analyzed and modified by the active guardrails, all other application parameters and metadata are passed as-is to the upstream LLM provider
  4. The return response from the provider is processed identically (scanning the output for toxicity and unmasking PII) before being passed seamlessly back to the client application with its original structure intact

Data Privacy & Telemetry Sync

SUPERWISE™ is engineered around an absolute privacy boundary.

+-----------------------+                    +-------------------------+
| SUPERWISE SaaS Cloud  |                    |      Customer VPC       |
|    (Control Plane)    |                    |      (Data Plane)       |
+-----------+-----------+                    +------------+------------+
            |                                             |
            |  <--- 1. Periodic Heartbeats & Telemetry -- |  [Intercepts Traffic]
            |  --- 2. Guardrail Configuration Pushes ---> |  [Enforces Guardrails]
            |                                             |  [Routes to Providers]
            X <========================================== X  
                      [ DATA PRIVACY BOUNDARY ]
                 NO CONVERSATION DATA LEAKS OUTSIDE

Dynamic Sync: Running Sentinel nodes establish a secure outbound connection to the Control Plane to send periodic heartbeats and basic operational metadata. In return, they fetch the active guardrail configurations needed to execute locally.

Telemetry Only: Telemetry sent to the cloud is strictly limited to volumetric counts, triggered security violation events, and proxy latency measurements.

Zero Interactivity Leaks: No raw data from your AI interactions is ever transmitted to the cloud. Your prompts, model responses, system instructions, and customer records stay 100% inside your network perimeter.