Baidu

ernie-4.5-300b

ERNIE 4.5-300B

Baidu flagship route for broad bilingual enterprise workloads.

Public model detail

MoE Transformer

Params

300B / 47B active

Context

131K

Max Output

32K

License

Baidu

TTFT

N/A

No 5m benchmark sample

5m RPM

N/A

Why pick it

Strong bilingual enterprise fit
Competitive BatchIn price target

Pricing

Tier      Standard          Cached   SiliconFlow  Savings
Realtime  $0.10 / $0.38     $0.035   N/A          N/A
Batch     $0.050 / $0.190   $0.035   N/A          N/A

Production pricing proof

How this route settles on a real request

When the model executes live, response headers can expose X-BatchIn-Provider, X-BatchIn-Route-Reason, X-BatchIn-Effective-Cost-Cents, and X-BatchIn-Uncached-Cost-Cents.
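As a rough illustration of reading these headers, the sketch below pulls the four names listed above out of any header mapping. The helper name `extract_batchin_headers` is hypothetical, and the assumption that the two cost headers carry plain numeric values is not confirmed by this page.

```python
from typing import Mapping, Optional


def _to_float(value: Optional[str]) -> Optional[float]:
    """Parse a numeric header value; return None if absent or not numeric."""
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None


def extract_batchin_headers(headers: Mapping[str, str]) -> dict:
    """Collect the BatchIn routing and cost headers, matching names case-insensitively."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "provider": lowered.get("x-batchin-provider"),
        "route_reason": lowered.get("x-batchin-route-reason"),
        # Assumption: the cost headers are plain numbers expressed in cents.
        "effective_cost_cents": _to_float(lowered.get("x-batchin-effective-cost-cents")),
        "uncached_cost_cents": _to_float(lowered.get("x-batchin-uncached-cost-cents")),
    }
```

With the OpenAI Python SDK, one way to obtain the headers of a non-streaming call is `client.chat.completions.with_raw_response.create(...)`, whose result has a `.headers` attribute that can be passed to this helper and a `.parse()` method that returns the usual completion object.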

The cached price here is the prompt-cache discounted rate for input tokens. Durable response-cache hits are proven separately through X-BatchIn-Response-Cache-Mode and request lookup.
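To make the prompt-cache discount concrete, here is a minimal cost estimate using the figures from the pricing table. Two things are assumptions, since the page does not state them: that the "Standard" pair reads as input / output price, and that prices are quoted per 1M tokens. The names `Tier` and `estimate_cost_usd` are illustrative.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # assumed pricing unit; the page gives no unit


@dataclass(frozen=True)
class Tier:
    """USD prices per TOKENS_PER_UNIT tokens."""
    input_usd: float
    output_usd: float
    cached_input_usd: float


# Values copied from the pricing table; the input / output reading is an assumption.
REALTIME = Tier(input_usd=0.10, output_usd=0.38, cached_input_usd=0.035)
BATCH = Tier(input_usd=0.050, output_usd=0.190, cached_input_usd=0.035)


def estimate_cost_usd(tier: Tier, input_tokens: int,
                      cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, billing cached input tokens at the cached rate."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_input_tokens
    total = (
        uncached_tokens * tier.input_usd
        + cached_input_tokens * tier.cached_input_usd
        + output_tokens * tier.output_usd
    )
    return total / TOKENS_PER_UNIT
```

Under these assumptions, a request with 1,000,000 input tokens (400,000 of them cached) and 200,000 output tokens comes to roughly $0.15 on the realtime tier and $0.082 on the batch tier. Actual billed amounts should be taken from the response headers or request lookup, not from this estimate.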

Streaming calls return X-Request-Id at the start; the final cost, cache mode, and routing details are then resolved through request lookup after the stream completes.
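The streaming flow can be sketched as follows: capture the request ID from the initial headers, drain the stream, and only then ask for billing details. The page does not document the lookup endpoint, so `lookup` here is a hypothetical caller-supplied function, and `consume_stream` is an illustrative name.

```python
from typing import Callable, Iterable, Mapping, Optional, Tuple


def consume_stream(
    headers: Mapping[str, str],
    chunks: Iterable[str],
    lookup: Callable[[str], dict],
) -> Tuple[str, Optional[dict]]:
    """Read X-Request-Id up front, drain the stream, then resolve details via lookup."""
    lowered = {k.lower(): v for k, v in headers.items()}
    request_id = lowered.get("x-request-id")

    # Final cost and cache mode are not known until the stream has finished.
    text = "".join(chunks)

    # `lookup` is a hypothetical stand-in for the request-lookup service.
    details = lookup(request_id) if request_id else None
    return text, details
```

Passing the lookup in as a callable keeps the sketch independent of however the lookup is actually exposed.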

Route source of truth

See pricing, request proof, and the upgrade path on one page

Standard, prompt-cache, batch, and SiliconFlow comparison prices stay visible without leaving the route.

Real requests return X-Request-Id, and buffered calls can expose route reason, billed cost, and uncached cost directly.

Validate in the Playground first, then talk to BatchIn about batch, white-label, or dedicated capacity.

Quick start

OpenAI-compatible surface. Swap the base URL and ship.

Python

import os

from openai import OpenAI

# Point the OpenAI SDK at the BatchIn endpoint; read the key from the environment.
client = OpenAI(
    base_url="https://api.luminapath.tech/v1",
    api_key=os.environ["BATCHIN_API_KEY"],
)

resp = client.chat.completions.create(
    model="ernie-4.5-300b",
    messages=[{"role": "user", "content": "Summarize why this model is a fit for my workload."}],
)

print(resp.choices[0].message.content)

JavaScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.luminapath.tech/v1",
  apiKey: process.env.BATCHIN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "ernie-4.5-300b",
  messages: [{ role: "user", content: "Summarize why this model is a fit for my workload." }],
});

console.log(resp.choices[0]?.message?.content);

cURL

curl https://api.luminapath.tech/v1/chat/completions \
  -H "Authorization: Bearer $BATCHIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ernie-4.5-300b",
    "messages": [{"role":"user","content":"Summarize why this model is a fit for my workload."}]
  }'

Specs

Architecture

MoE Transformer

Vendor group

Baidu

Context window

131K

Max output

32K

Best for

bilingual

enterprise

Related models

Tencent

hunyuan-a13b

Hunyuan-A13B

Compact Tencent route for low-cost bilingual chat and product assistant scenarios.

Z.ai

glm-5

GLM-5

Lower-cost GLM route for production reasoning, agents, and long-context workflows.

DeepSeek

deepseek-v3

DeepSeek V3

Stable general-purpose DeepSeek route for large-scale chat and batch workloads.

Z.ai

glm-5.1

GLM-5.1

Open-source coding flagship built for long-horizon autonomous engineering and deep reasoning.