BatchIn turns dedicated capacity into a buying decision your infrastructure team can actually act on: quoted accelerator lanes, rollout timing, and optional billing or audit surfaces when production needs them.
The strongest path is to validate one workload first, then expand into batch, white-label delivery, or reserved capacity without changing the backend or the vendor boundary.
H200 · $1.80/hr
Turnup: same day to 10 days
Access: SSH root
Procurement: quote + rollout plan
Commercial shape
The goal is not a brochure page. Each product should make the next commercial decision obvious, whether that is self-serve adoption, operator rollout, or reserved capacity.
01
Choose the accelerator lane that matches memory pressure, latency targets, and how quickly the team needs to move into production.
02
GPU leasing is for workloads where serverless is the wrong abstraction. BatchIn provides the capacity; your team controls the operating model.
The backend already ships more than a contact form: buyer discovery, tokenized status reads, commercial reservation states, contract tracking, and provisioning milestones are all explicit routes.
01
Expose real offer IDs, monthly pricing, setup timing, region scope, and workload fit before procurement starts.
GET /v1/leasing/offers
02
Turn a capacity request into a persistent inquiry with estimated pricing plus a customer-safe status token.
POST /v1/leasing/inquiries · GET /v1/leasing/inquiries/{inquiry_id}
03
Internal ops can quote, hold, reserve, release, expire, or mark the opportunity lost without leaving the API.
POST /v1/leasing/internal/inquiries/{id}/quote · /status
04
Once reserved, the same inquiry carries signed-order, rack assignment, burn-in, ready, and live milestones.
POST /v1/leasing/internal/inquiries/{id}/order · /provisioning
CRM, ERP, and delivery systems can either poll the leasing ledger or subscribe to signed milestone callbacks.
Pull JSON or CSV by commercial status, contract milestone, provisioning stage, offer, or updated_since windows.
GET /v1/leasing/internal/inquiries?format=json|csv
Create per-partner webhook subscriptions for leasing.order.updated and leasing.provisioning.updated, optionally filtered by offer.
POST /v1/leasing/internal/webhooks
Inspect failed deliveries and redrive one event into a fresh signed attempt without mutating the inquiry state.
GET /v1/leasing/internal/webhooks/dead-letter · POST .../replay
The point is to make the next commercial step obvious before you open a procurement thread or start a migration.
01
Start with an 8-32 GPU reserved slice when you are moving off shared endpoints, validating throughput, or proving a customer workload.
Typical turnup: same day to 3 business days
02
Move into 64-256 GPU delivery with image handoff, VPC coordination, observability hooks, and a clearer operator boundary.
Typical turnup: 3 to 10 business days
03
Use quote-based cohorts for B200 or other constrained lanes when the workload justifies pre-allocated windows and deeper rollout planning.
Typical turnup: allocation window + project quote
Reference hourly rates, workload fit, and realistic delivery timing for the current leasing menu.
Frontier or constrained inventory is sold by quote and allocation window rather than placeholder list pricing.
| GPU | From $/GPU-hr | VRAM | Architecture | Best for | Availability |
|---|---|---|---|---|---|
| B200 | Quote-based | 192GB HBM3e | Blackwell | FP4-heavy flagship serving and frontier multimodal clusters | Allocation window |
| H200 | $1.80 | 141GB HBM3e | Hopper | Large MoE and flagship inference | Same day |
| H100 | $1.50 | 80GB HBM3 | Hopper | Industry-standard production serving | 2-5 days |
| H20 | $1.20 | 96GB HBM3 | Hopper | Long-context inference and reserved-capacity deployment | 5-10 days |
| A800 | $1.00 | 80GB HBM2e | Ampere | Mid-size models and cost-optimized serving | Same day |
| 910C | $0.80 | 64GB HBM2e | Ascend | Alternative silicon planning and lower-cost deployment | Same day |
| L40S | $0.60 | 48GB GDDR6X | Ada Lovelace | Image, video, and embedding inference | 2-5 days |
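For budgeting against the list rates above, a back-of-envelope monthly estimate is just GPUs × rate × hours. A minimal sketch; actual quotes depend on term, region, and reserved-capacity discounts:

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)

def monthly_cost(gpu_count: int, rate_per_gpu_hr: float) -> float:
    """Back-of-envelope always-on monthly cost for a reserved slice."""
    return gpu_count * rate_per_gpu_hr * HOURS_PER_MONTH

# e.g. an 8x H200 slice at the $1.80/GPU-hr list rate comes to
# about $10,512 per month before any committed-term adjustment.
```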
Rollout path
01
Confirm model fit, latency expectations, and pricing boundaries before procurement starts.
02
Set the right key policy, billing path, and audit expectations for the team that will run it.
03
Move into batch, white-label delivery, or dedicated capacity only when traffic or commitments justify it.
Adjacent product paths
BatchIn is strongest when one product page makes the next commercial move obvious, instead of forcing the buyer to stitch the rollout path together alone.
Inference API
OpenAI-compatible routing, billing, and audit controls for production traffic.
White-Label API
Keep your brand, customer relationship, and billing surface on top of the same backend.
VaaS Audit
Expose signed records and browser-side verification when trust needs to be inspectable.