OpenAI OSS

gpt-oss-120b

GPT-OSS-120B

OpenAI open-weight MoE with pragmatic pricing for general chat, agents, and product workflows.

Public model detail

MoE Transformer

Params

120B MoE

Context

131K

Max Output

32K

License

Apache 2.0

TTFT

310ms

Throughput

72 tok/s

Why pick it

  • Open-weight model from the OpenAI family, released under Apache 2.0
  • Priced low enough for broad, high-volume deployment

Pricing

Tier       Standard (in / out)   Cached    SiliconFlow (in / out)   Savings
Realtime   $0.02 / $0.15         $0.007    $0.05 / $0.45            60%
Batch      $0.01 / $0.07         $0.007    $0.05 / $0.45            60%
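
As a quick sanity check on the table above, here is a minimal cost estimator. It assumes the listed rates are quoted per 1M tokens (input / output), which is conventional for pages like this but not stated explicitly; the tier names and rates come straight from the table.

```python
# Rough per-request cost estimate from the pricing table above.
# Assumption: rates are USD per 1M tokens (input / output) -- the page
# does not state the unit, so treat these numbers as illustrative.

RATES = {
    # tier: (input $/1M tokens, output $/1M tokens)
    "realtime": (0.02, 0.15),
    "batch": (0.01, 0.07),
}

def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request under the given tier."""
    in_rate, out_rate = RATES[tier]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# e.g. a 10K-token prompt with a 2K-token reply on the realtime tier:
print(f"${estimate_cost('realtime', 10_000, 2_000):.4f}")  # → $0.0005
```

Swap in your own token counts to compare the realtime and batch tiers before committing a workload to one of them.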

Quick start

OpenAI-compatible surface. Swap the base URL and ship.

Python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.luminapath.tech/v1",
    api_key=os.environ["BATCHIN_API_KEY"],  # read the key from the environment
)

resp = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Summarize why this model is a fit for my workload."}],
)

print(resp.choices[0].message.content)
JavaScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.luminapath.tech/v1",
  apiKey: process.env.BATCHIN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Summarize why this model is a fit for my workload." }],
});

console.log(resp.choices[0]?.message?.content);
cURL
curl https://api.luminapath.tech/v1/chat/completions \
  -H "Authorization: Bearer $BATCHIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{"role":"user","content":"Summarize why this model is a fit for my workload."}]
  }'

Specs

Architecture

MoE Transformer

Vendor group

OpenAI

Context window

131K

Max output

32K

Best for

open-source
general
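
Because the context window and max output listed above share one budget on most chat-completion APIs, a small helper can tell you how large a prompt still leaves room for the reply you want. This is a sketch under two assumptions: "131K" means 131,072 tokens and "32K" means 32,768, and requested output tokens count against the same window.

```python
# Sketch: budget input tokens against the advertised context window.
# Assumptions: "131K" = 131,072 tokens, "32K" = 32,768 tokens, and the
# requested completion shares the context window with the prompt
# (typical for chat-completion APIs, but not stated on this page).

CONTEXT_WINDOW = 131_072  # "131K" context
MAX_OUTPUT = 32_768       # "32K" max output

def max_input_tokens(requested_output: int) -> int:
    """Largest prompt that still leaves room for the requested completion."""
    if requested_output > MAX_OUTPUT:
        raise ValueError(f"model caps output at {MAX_OUTPUT} tokens")
    return CONTEXT_WINDOW - requested_output

print(max_input_tokens(4_096))  # room left for a 4K-token reply → 126976
```

Run this check before sending long documents so a request is trimmed client-side instead of being rejected by the API.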

Related models

OpenAI OSS

gpt-oss-20b

GPT-OSS-20B

Compact OpenAI open-weight option for fast chat, routing, and lower-cost product features.

View detail
Alibaba

qwen3.5-27b

Qwen3.5-27B

Lean Qwen route aimed at lower-cost chat, agent routing, and product copilot features.

View detail
StepFun

step-3.5-flash

Step-3.5-Flash

High-traffic StepFun flash model tuned for cheap fast inference and agent loops.

View detail
Z.ai

glm-5.1

GLM-5.1

Open-source coding flagship built for long-horizon autonomous engineering and deep reasoning.

View detail