AI Model Rankings

Real-time AI model performance rankings. Compare latency, quality, and cost across 200+ models from OpenAI, Anthropic, Google, DeepSeek, and more. Rankings are generated from aggregated production traffic, not vendor claims.

The live ranking table loads in your browser with sortable columns for latency, time-to-first-token, tokens-per-second, success rate, and cost-per-quality-point. If JavaScript isn't running, the methodology below describes how the rankings work.

Top models by usage this week

Ranked by token volume across the AI industry. The live, sortable leaderboard with latency and cost columns loads above.

#   Model                          Provider      Usage share
1   deepseek-ai/deepseek-v4-flash  DeepSeek      13.9%
2   minimaxai/minimax-m3           MiniMax       12.3%
3   mimo-v2.5                      Xiaomi        11.4%
4   claude-opus-4-6                Anthropic     8.7%
5   deepseek-ai/deepseek-v4-pro    DeepSeek      7.4%
6   openrouter/owl-alpha           Unknown       7.1%
7   claude-sonnet-4-6              Anthropic     4.6%
8   deepseek-v3                    DeepSeek      3.1%
9   glm-4.7                        Zhipu (智谱)  3.0%
10  gemini-3-flash-preview         Google        2.8%
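
As an illustration of how a usage-share column like the one above could be derived from token volume, here is a minimal Python sketch. The function name `usage_share` and the input shape (a mapping from model name to token count) are assumptions made for this example; the page does not describe the actual pipeline.

```python
from typing import Dict, List, Tuple


def usage_share(token_counts: Dict[str, int]) -> List[Tuple[int, str, float]]:
    """Rank models by token volume and return (rank, model, share in percent).

    `token_counts` is a hypothetical input: total tokens per model over
    the ranking window (for example, one week).
    """
    total = sum(token_counts.values())
    if total == 0:
        return []
    # Highest token volume first.
    ranked = sorted(token_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (rank, model, round(100 * tokens / total, 1))
        for rank, (model, tokens) in enumerate(ranked, start=1)
    ]


# Placeholder model names and counts, not real traffic data.
rows = usage_share({"model-a": 750, "model-b": 250})
```

Shares are rounded to one decimal place to match the table's display, so the rounded values may not sum to exactly 100%.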

What we measure

  • Latency: time-to-first-token and tokens-per-second across actual production traffic
  • Quality: benchmarked against standard evaluation datasets
  • Cost efficiency: performance-per-dollar across different task types
  • Reliability: uptime and error rates in production
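
The metrics above can be sketched from per-request timing records. This is a rough Python illustration, not the site's implementation: the `RequestRecord` fields, the use of medians, and the definition of cost-per-quality-point as cost divided by a quality score are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass
class RequestRecord:
    """Hypothetical per-request log entry; timestamps are in seconds."""
    sent_at: float
    first_token_at: Optional[float]  # None if no token was ever received
    finished_at: Optional[float]     # None if the request never completed
    output_tokens: int
    succeeded: bool


def time_to_first_token(r: RequestRecord) -> float:
    # Delay between sending the request and receiving the first token.
    return r.first_token_at - r.sent_at


def tokens_per_second(r: RequestRecord) -> float:
    # Generation throughput, measured after the first token arrives.
    duration = r.finished_at - r.first_token_at
    return r.output_tokens / duration if duration > 0 else 0.0


def cost_per_quality_point(cost_usd: float, quality_score: float) -> float:
    # Assumed definition: dollars spent per point of benchmark score.
    if quality_score <= 0:
        raise ValueError("quality_score must be positive")
    return cost_usd / quality_score


def summarize(records: List[RequestRecord]) -> Dict[str, Optional[float]]:
    if not records:
        raise ValueError("no records to summarize")
    # Only completed, successful requests contribute latency samples.
    ok = [
        r for r in records
        if r.succeeded and r.first_token_at is not None and r.finished_at is not None
    ]
    return {
        "success_rate": sum(r.succeeded for r in records) / len(records),
        "median_ttft_s": median(time_to_first_token(r) for r in ok) if ok else None,
        "median_tokens_per_s": median(tokens_per_second(r) for r in ok) if ok else None,
    }
```

Medians are used here because latency distributions tend to have long tails; a production system might report percentiles such as p95 as well.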

Rankings update regularly to reflect the latest model versions and provider infrastructure changes. The auto-router uses the same data when picking a model for celedog/auto-* requests.
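
To make the routing idea concrete, here is a hypothetical sketch of how a router could choose a model from ranking data. The selection rule (filter by latency and success rate, then take the lowest cost per quality point), the `ModelStats` fields, and the thresholds are assumptions for illustration; the actual auto-router logic is not documented on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelStats:
    """Hypothetical per-model summary drawn from the ranking data."""
    name: str
    median_ttft_s: float      # median time-to-first-token, seconds
    success_rate: float       # fraction of requests that succeeded, 0..1
    quality: float            # benchmark score, higher is better
    cost_per_mtok_usd: float  # price per million tokens, USD


def pick_model(
    candidates: List[ModelStats],
    max_ttft_s: float = 2.0,
    min_success_rate: float = 0.99,
) -> Optional[ModelStats]:
    # Drop models that miss the latency or reliability thresholds.
    eligible = [
        m for m in candidates
        if m.median_ttft_s <= max_ttft_s
        and m.success_rate >= min_success_rate
        and m.quality > 0
    ]
    if not eligible:
        return None
    # Among the rest, prefer the cheapest model per quality point.
    return min(eligible, key=lambda m: m.cost_per_mtok_usd / m.quality)


# Placeholder entries, not measurements of real models.
choice = pick_model([
    ModelStats("model-a", 0.4, 0.995, 80.0, 2.0),
    ModelStats("model-b", 0.3, 0.999, 60.0, 1.0),
    ModelStats("model-c", 3.0, 0.999, 90.0, 0.5),
])
```

In this example `model-c` is excluded for exceeding the latency threshold, and `model-b` wins on cost per quality point. A real router would likely also weigh task type, as the cost-efficiency metric above suggests.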