# Smartflow Platform Overview

Smartflow — Enterprise AI Platform Overview

## Smartflow

Enterprise AI Gateway & Governance Platform\
Multi-architecture · Multi-provider · Policy-first

Production Ready AMD64 + ARM64 A2A Protocol MCP Gateway Policy Engine

100+ LLM Providers · 3-tier Semantic Cache · A2A Agent Protocol · MCP Tool Gateway · PKCE OAuth Flows · NLP Tool Search

**Smartflow** is an enterprise-grade AI proxy, policy engine, and agent gateway. It sits in front of any LLM provider — OpenAI, Anthropic, Azure, Google, local models — and adds compliance enforcement, semantic caching, MCP tool orchestration, virtual key budgeting, intelligent load balancing, and full observability, all without changing your existing API calls.

***

### Core Proxy & Routing

#### Universal LLM Proxy

Single endpoint for every major LLM provider. Drop-in replacement for OpenAI's `/v1/chat/completions` — no client-side changes required.

* OpenAI, Anthropic, Azure, Google, Mistral, Cohere, local GGUF/ONNX
* Streaming (SSE) and synchronous responses
* Per-request model overrides via headers
* Transparent passthrough — existing SDKs work unchanged

Feature Parity

#### Load Balancing & Fallback Chains

Distribute traffic across provider endpoints with automatic failover. Named fallback chains stored in Redis — model-level or provider-level granularity.

* Strategies: Round Robin, Weighted, Least Connections, Random, Priority
* Retry with exponential backoff for 429 / 5xx
* Non-retryable 4xx errors bypass retry, move to next target immediately
* Chain CRUD via `/api/routing/fallback-chains`

Feature Parity — Advanced Retry Logic

#### Virtual Key Management

Issue `sk-sf-{48-hex}` tokens to users and applications with hard budget caps, rate limits, and automatic period resets.

* Budget periods: Daily, Weekly, Monthly, Lifetime
* Pre-request budget check — 429 with budget headers before any spend
* Post-response spend recording per actual cost
* TPM / RPM rate limit enforcement
* Full CRUD at `/api/enterprise/vkeys`

Feature Parity

#### Key Vault & Provider Key Store

Centrally store and retrieve provider API keys. The proxy never exposes raw keys to clients — all credentials stay server-side in Redis.

* Intercepts and stores keys from inbound requests
* Per-provider resolution at routing time
* Supports environment variable, Redis, and vault-style references

Feature Parity

***

### Intelligent Caching

**Smartflow advantage:** LiteLLM supports exact-match caching and semantic caching via an external Qdrant vector store. Smartflow's entire caching stack is **native** — no external vector database required. Semantic matching, adaptive TTL, per-request control, and cost tracking are built directly into the proxy.

#### 3-Tier Semantic Cache

L1 in-process memory → L2 Redis with embedding similarity → L3 Redis exact match. Responses served from cache include `X-Cache-Key` for client-side deduplication.

* Exact-match and semantic (cosine similarity) lookup
* Adaptive TTL based on query volatility
* Per-request opt-out via `x-smartflow-cache: skip`
* Estimated cost saved tracked and reported in dashboards
* Native Redis — no external Qdrant or Weaviate required

Unique: Native Semantic Cache

#### MCP Tool-Call Cache

Separately caches MCP tool responses. Identical tool calls with identical parameters return instantly without re-invoking the MCP server.

* Per-server, per-tool hit statistics
* Selective flush: entire server or single tool
* Cache ping, delete, and alias endpoints
* Cache statistics visible in dashboard

Unique

***

### MCP Gateway

**Smartflow advantage:** LiteLLM's MCP support is server-list plus tool routing. Smartflow adds an enterprise control plane: AD-group access control, approval workflows, OAuth PKCE consent, per-server auth header forwarding, semantic tool search, and performance-based routing — none of which exist in LiteLLM.

#### MCP Server Registry

Central registry for all MCP servers. Supports HTTP, SSE, and STDIO transports. Configuration persisted in Redis and manageable via API or dashboard.

* HTTP, SSE, and STDIO transports
* Auth: API key, OAuth client\_credentials, Basic, mTLS, PKCE
* Server aliases for stable routing regardless of URL changes
* Health monitoring with cost and latency tracking
* OpenAPI spec auto-generation per server

Feature Parity + PKCE, mTLS, Aliases

#### Access Control & Approval Workflow

Enterprise-grade gates on every MCP tool call. AD/LDAP group membership drives allow/deny decisions. Sensitive tools require explicit admin approval before use.

* Per-server and per-tool AD group allow/deny lists
* Catalog of approval-required tool requests
* Approve / deny workflow via dashboard or API
* Every tool call logged with user, server, result, and cost

Unique

#### Semantic Tool Search

NLP search across every tool on every registered MCP server. Ask "find a tool that reads files" and get back ranked results — no need to know which server hosts the tool.

* Embeddings indexed per tool: name + description + parameter names
* Cosine similarity ranking across all servers simultaneously
* `GET /api/mcp/tools/search?q=...&k=5`
* `POST /api/mcp/tools/reindex` triggers full re-index
* Optional server filter for scoped search

New Unique

#### OAuth PKCE Browser Consent

User-facing interactive OAuth for MCP servers that require individual user consent (GitHub, Google Workspace, Slack) — not just machine-to-machine credentials.

* PKCE code\_verifier + SHA256 challenge — RFC 7636 compliant
* `GET /api/mcp/auth/initiate` → browser redirect URL
* `GET /api/mcp/auth/callback` → token exchange + storage
* User-scoped tokens in Redis, independent per (user, server)
* Pending sessions expire after 10 minutes (configurable)

New Unique

#### Per-Server Auth Header Forwarding

Pass server-specific credentials on individual requests without storing them. Clients send `x-mcp-{alias}-*` headers; Smartflow strips them before forwarding to the end LLM.

* Headers scoped by server alias — no credential leakage across servers
* Supports any arbitrary header name per server
* Works across HTTP, SSE, and STDIO transports

New Unique

***

### A2A Agent Gateway

**Smartflow leads here:** LiteLLM has a basic A2A prototype. Smartflow implements the full Google A2A open protocol — Agent Cards, task lifecycle, SSE streaming, Redis-backed task history, and cross-agent trace headers. Any A2A-compatible client (LangGraph, Vertex AI, Azure AI Foundry, Bedrock AgentCore) can connect to a Smartflow agent out of the box.

#### Agent Registry & Agent Cards

Register named agents in Redis. Each agent has a model, system prompt, optional MCP tool access, and a machine-readable Agent Card that advertises its capabilities.

* Agent profiles stored in Redis — instant updates, no redeploy
* `GET /.well-known/agent.json` — gateway card listing all agents
* `GET /a2a/{id}/.well-known/agent.json` — per-agent card
* Skills, auth schemes, streaming capability all declared in card

New A2A Protocol

#### Task Lifecycle Management

Full A2A task state machine: submitted → working → completed / failed / canceled. Every task is persisted in Redis with full message history and artifact outputs.

* `tasks/send` — synchronous execution with full Task response
* `tasks/sendSubscribe` — SSE stream of status update events
* `tasks/get` — retrieve task + history + artifacts by ID
* `tasks/cancel` — cancel in-flight tasks
* Task history trimming via `history_length` parameter
* 24-hour TTL with per-agent task index in Redis

New

#### Cross-Agent Tracing

Trace requests across multiple agents using a shared trace ID. Smartflow propagates and stores the trace context with every task.

* `X-A2A-Trace-Id` header forwarded through the call chain
* Trace ID stored in task metadata for correlation
* Interoperable with LangGraph, Vertex AI, Azure AI Foundry, Bedrock

New A2A Protocol

#### A2A Admin API

Full management surface for the agent gateway — create, inspect, and remove agents without touching config files or restarting the proxy.

* `GET/POST /api/a2a/agents` — list and register agents
* `GET/DELETE /api/a2a/agents/{id}` — inspect or remove
* `GET /api/a2a/agents/{id}/tasks` — recent task history
* `GET /api/a2a/tasks/{id}` — inspect any task by ID

New

***

### Policy & Compliance Engine

**Smartflow leads here:** LiteLLM has no equivalent. Smartflow's policy engine classifies applications, enforces usage policies, detects compliance violations in real time, and archives structured logs to MongoDB and TimescaleDB for audit and BI.

#### Policy Engine (Maestro)

Real-time policy evaluation on every request and response. Policies are defined, versioned, and stored in TimescaleDB. The Maestro dashboard provides a unified policy management interface.

* Input and output content scanning
* Application classification by request pattern
* Policy CRUD at `/api/policy/*`
* Violation alerting and audit trail

Unique

#### Compliance API

Dedicated compliance microservice (`compliance_api_server`) for regulated industries. Runs independently for high-availability compliance checking.

* Regulatory framework mapping (GDPR, HIPAA, financial)
* ML-based violation detection
* Structured compliance reports via `/api/compliance/*`

Unique

#### VAS Logging & Analytics

Every request generates a structured Value-Added Service log with cost, latency, model, user, policy result, virtual key token, and compliance verdict.

* TimescaleDB time-series for metrics and trends
* MongoDB archival of full request/response logs
* Cost breakdown by user, team, model, and provider
* Sustainability metrics (energy / carbon per token)

Unique

***

### Observability & Telemetry

#### Real-Time Dashboards

Browser-based dashboards served directly from the proxy. No external monitoring stack required for core operational visibility.

* Cost and usage by model, provider, user
* Cache hit rate, cost savings, L1/L2/L3 breakdown
* MCP tool call volume, latency, and error rates
* Policy violation trends and compliance score
* Sustainability metrics

Unique

#### Telemetry API

Programmatic access to all platform telemetry via structured REST endpoints. Feed dashboards, alert systems, or BI tools directly.

* `/api/telemetry/*` — time-windowed metrics
* `/api/insights/*` — aggregated cost and usage insights
* Export compatible with Grafana, Prometheus, and custom integrations

Feature Parity + Compliance & Sustainability

***

### Feature Comparison — Smartflow vs LiteLLM

| Feature                                   | Smartflow  | LiteLLM           | Notes                                                                           |
| ----------------------------------------- | ---------- | ----------------- | ------------------------------------------------------------------------------- |
| Proxy & Routing                           |            |                   |                                                                                 |
| OpenAI-compatible endpoint                | ✓          | ✓                 | Full parity                                                                     |
| 100+ LLM providers                        | ✓          | ✓                 | Full parity                                                                     |
| Streaming (SSE)                           | ✓          | ✓                 | Full parity                                                                     |
| Load balancing strategies                 | ✓          | ✓                 | Both support multiple strategies                                                |
| Fallback chains with retry logic          | ✓ Advanced | ✓                 | Smartflow: retryable vs non-retryable error classification, exponential backoff |
| Virtual key budgets                       | ✓          | ✓                 | Full parity                                                                     |
| Caching                                   |            |                   |                                                                                 |
| Exact-match cache                         | ✓          | ✓                 | Full parity                                                                     |
| Semantic cache                            | ✓ Native   | ◑ Requires Qdrant | Smartflow: no external vector DB required                                       |
| 3-tier cache (L1/L2/L3)                   | ✓          | ✗                 | Smartflow only                                                                  |
| Adaptive TTL                              | ✓          | ✗                 | Smartflow only                                                                  |
| Per-request cache control header          | ✓          | ✗                 | Smartflow: `x-smartflow-cache: skip`                                            |
| MCP tool-call cache                       | ✓          | ✗                 | Smartflow only                                                                  |
| MCP Gateway                               |            |                   |                                                                                 |
| MCP server registry                       | ✓          | ✓                 | Full parity                                                                     |
| HTTP + SSE transports                     | ✓          | ✓                 | Full parity                                                                     |
| STDIO transport                           | ✓          | ✓                 | Full parity                                                                     |
| AD/LDAP group access control              | ✓          | ✗                 | Smartflow only                                                                  |
| Tool approval workflow                    | ✓          | ✗                 | Smartflow only                                                                  |
| Semantic tool search (NLP)                | ✓          | ✗                 | Smartflow only                                                                  |
| OAuth PKCE interactive flow               | ✓          | ✗                 | Smartflow only                                                                  |
| Per-server auth header forwarding         | ✓          | ✗                 | Smartflow only                                                                  |
| OpenAPI generation per server             | ✓          | ✗                 | Smartflow only                                                                  |
| Agent Gateway (A2A)                       |            |                   |                                                                                 |
| A2A protocol (Google standard)            | ✓ Full     | ◑ Partial         | Smartflow: Agent Cards, task lifecycle, SSE, Redis task store                   |
| Agent Card auto-generation                | ✓          | ✗                 | Smartflow only                                                                  |
| SSE task event streaming                  | ✓          | ✗                 | Smartflow only                                                                  |
| Redis-backed task history                 | ✓          | ✗                 | Smartflow only                                                                  |
| Cross-agent trace propagation             | ✓          | ✗                 | X-A2A-Trace-Id — Smartflow only                                                 |
| Policy & Compliance                       |            |                   |                                                                                 |
| Real-time policy engine                   | ✓          | ✗                 | Smartflow only                                                                  |
| Compliance microservice                   | ✓          | ✗                 | Smartflow only                                                                  |
| VAS audit logging (MongoDB + TimescaleDB) | ✓          | ✗                 | Smartflow only                                                                  |
| Sustainability metrics                    | ✓          | ✗                 | Smartflow only                                                                  |
| Enterprise Auth                           |            |                   |                                                                                 |
| SAML / Kerberos / AD integration          | ✓          | ✗                 | Smartflow only                                                                  |
| mTLS for upstream connections             | ✓          | ✗                 | Smartflow only                                                                  |
| Multi-arch Docker (AMD64 + ARM64)         | ✓          | ✓                 | Full parity                                                                     |

Legend: ✓ Supported · ✗ Not supported · ◑ Partial\
Advanced = Smartflow's implementation goes further · New = added in current release · Unique = only in Smartflow · Feature Parity = equivalent to LiteLLM

***


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.langsmart.ai/smartflow-platform-overview.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
