Inference API

OpenAI-compatible inference with routing, audit, and real operational control.

Serve text, image, audio, and embedding workloads from one platform, then layer in VaaS, model routing, and payment flexibility as traffic grows.

At a glance

  • Models: 38
  • Always-on: 36
  • Audit: VaaS on every call
  • Routing: one endpoint

Built for drop-in migration

Use the OpenAI SDKs you already have. Swap the base URL, issue a BatchIn key, and keep your current request/response patterns intact.

  • OpenAI-compatible chat, embeddings, images, and batch flows.
  • Real-time streaming plus queue-backed workloads from the same control plane.
  • VaaS audit records available when verification matters.
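The drop-in claim comes down to the request shape: the body an OpenAI SDK emits works unchanged against a new base URL. A minimal sketch, assuming the platform exposes the standard OpenAI path layout (`/chat/completions`); the base URL, key, and model name below are placeholders, not real values:

```python
import json

BASE_URL = "https://your-endpoint.example/v1"  # placeholder: swap in the URL from your console
API_KEY = "YOUR_BATCHIN_KEY"                   # placeholder: a key issued from the console

def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-compatible /chat/completions request.

    The body is identical to what the OpenAI SDKs send, which is what
    makes the migration a base-URL swap rather than a rewrite.
    """
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": stream,  # True switches the response to server-sent events
        }),
    }

req = build_chat_request("your-model", "Hello", stream=True)
print(req["url"])
```

In SDK terms, the same swap is the `base_url` argument to the client constructor; everything downstream (messages, streaming, embeddings, batch) keeps its existing shape.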

Production controls without extra layers

The platform focuses on routing, auditability, billing, and rate control instead of hiding model behavior behind another black box.

  • Per-key rate limits and internal key types for non-billable platform traffic.
  • USDC, Stripe, and regional payment workflows from the same console.
  • Clear latency, usage, and cost telemetry for ongoing optimization.
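On the client side, per-key limits typically surface as HTTP 429 responses. A minimal retry-with-exponential-backoff sketch, assuming standard 429 semantics (the `send` callable and its simulated responses below are illustrative, not part of the platform API):

```python
import time

def call_with_backoff(send, max_retries=5, base_delay=1.0):
    """Retry an inference call on HTTP 429, doubling the delay each attempt.

    `send` is any callable returning (status_code, body); real code would
    wrap an HTTP client call to the inference endpoint.
    """
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return send()  # final attempt; surface whatever comes back

# Simulated endpoint: rate-limited twice, then succeeds.
calls = {"n": 0}
def fake_send():
    calls["n"] += 1
    return (429, "") if calls["n"] < 3 else (200, "ok")

status, body = call_with_backoff(fake_send, base_delay=0.01)
print(status, body)  # -> 200 ok
```

Pairing a backoff loop like this with the platform's latency and usage telemetry makes it straightforward to tell whether 429s mean a key limit is set too low or traffic genuinely spiked.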