Use Celedog Auto-routing for Cost Optimization
Auto-routing is a one-line change that lets Celedog pick the right concrete model per request. Cheap models handle simple prompts, strong models handle hard ones, and automatic fallback covers outages.
Auto-routing replaces a fixed model name with a strategy and lets Celedog pick the concrete model per request. Done right, it trims 20–40% off an inference bill with no measurable quality drop, because most requests are simple enough for a budget model to handle well.
The idea
Instead of hard-coding "gpt-5.5", you send a meta-model such as "celedog/auto-cheapest". The gateway evaluates the request and routes it to the lowest-cost model that clears your quality bar, falling back automatically if an upstream errors. You keep one model string in your code and let the gateway optimise.
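To make the routing decision concrete, here is a minimal sketch of the kind of logic described above: choose the cheapest candidate that clears a quality bar, and move on to the next one if a call fails. This is not Celedog's actual implementation. The Candidate fields, the prices, the quality scores, and the call hook are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical model identifier
    cost_per_1k: float  # illustrative price, not a real quote
    quality: float      # illustrative score between 0 and 1


def route(
    candidates: Sequence[Candidate],
    quality_bar: float,
    call: Callable[[str], str],
) -> Tuple[str, str]:
    """Try eligible models from cheapest to priciest; return (model, output)."""
    eligible = sorted(
        (c for c in candidates if c.quality >= quality_bar),
        key=lambda c: c.cost_per_1k,
    )
    last_error: Optional[Exception] = None
    for candidate in eligible:
        try:
            return candidate.name, call(candidate.name)
        except Exception as exc:  # upstream failed, fall back to the next model
            last_error = exc
    raise RuntimeError("no eligible model succeeded") from last_error
```

Returning the model name alongside the output mirrors the auditability point: the caller always learns which model actually answered.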
Using it
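# Assumes the OpenAI Python SDK against an OpenAI-compatible endpoint; env var names are placeholders.
import os
from openai import OpenAI

client = OpenAI(base_url=os.environ["CELEDOG_BASE_URL"], api_key=os.environ["CELEDOG_API_KEY"])

resp = client.chat.completions.create(
    model="celedog/auto-cheapest",
    messages=[{"role": "user", "content": "Summarise this email in one line."}],
)
print(resp.model)  # the concrete model that actually served it
print(resp.choices[0].message.content)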
Note resp.model: Celedog tells you which concrete model served each request, so the routing decision is auditable rather than a black box.
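One way to use that audit trail is to tally the concrete model names over a batch of calls. The sketch below assumes you collect resp.model into a list yourself; the model names shown are made-up placeholders, not real Celedog identifiers.

```python
from collections import Counter
from typing import Iterable


def tally_served_models(model_names: Iterable[str]) -> Counter:
    """Count how many requests each concrete model served."""
    return Counter(model_names)


# Illustrative values only; in practice, append resp.model after each call.
served = ["budget-model", "budget-model", "frontier-model", "budget-model"]
for name, count in tally_served_models(served).most_common():
    print(f"{name}: {count}")
```

A tally like this gives a quick check that simple traffic really is landing on cheaper models.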
Choosing a strategy
- auto-cheapest — lowest cost that meets the quality bar. Best default for high-volume, simple tasks (classification, extraction, short summaries).
- auto-fastest — lowest latency. Use for interactive UIs where time-to-first-token matters most.
- auto-quality — highest-quality model. Reserve for hard reasoning, long-form generation, or agentic coding.
A practical pattern: tier by task
Most apps have a mix of easy and hard calls. Route the easy ones (formatting, tagging, routing user intent) through auto-cheapest and the hard ones (final answer, code generation) through auto-quality. You pay premium prices only where they earn their keep.
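The tiering pattern can be sketched as a small lookup. The task labels are hypothetical, and "celedog/auto-quality" is assumed to follow the same "celedog/" prefix as "celedog/auto-cheapest"; check the API documentation for the exact strings.

```python
CHEAP = "celedog/auto-cheapest"
QUALITY = "celedog/auto-quality"  # assumed to follow the same "celedog/" prefix

# Hypothetical task labels; adapt to whatever your app actually distinguishes.
EASY_TASKS = frozenset({"formatting", "tagging", "intent"})


def model_for(task: str) -> str:
    """Return the routing strategy for a task, defaulting to the quality tier."""
    return CHEAP if task in EASY_TASKS else QUALITY
```

You would then pass model=model_for("tagging") to the same create call. Defaulting unknown tasks to the quality tier is a deliberately conservative choice: a mislabelled task costs more rather than answering worse.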
Measuring the win
Use the Settlement / usage dashboard to compare cost-per-request before and after. The typical result: a large share of traffic silently moves to a budget model, the bill drops, and user-facing quality is unchanged because those requests never needed a frontier model.
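The before/after comparison is simple arithmetic once you have total spend and request counts for each period. The figures below are invented for illustration and are not measured Celedog results.

```python
def cost_per_request(total_cost: float, request_count: int) -> float:
    """Average spend per request over a period."""
    if request_count <= 0:
        raise ValueError("request_count must be positive")
    return total_cost / request_count


def savings_pct(before: float, after: float) -> float:
    """Percentage reduction in cost-per-request from before to after."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Made-up numbers: same traffic volume, lower bill after switching.
before = cost_per_request(500.0, 100_000)
after = cost_per_request(350.0, 100_000)
print(f"{savings_pct(before, after):.1f}% cheaper per request")
```

Comparing per-request cost rather than the raw bill keeps the result meaningful when traffic volume changes between the two periods.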
Auto-routing is the single highest-leverage cost lever a gateway offers: one string change, a fifth to two-fifths off the bill, and full visibility into which model served each call.
Written by Celedog Team · Last updated May 28, 2026