Performance & Quality
Frontier Leaderboard
A public-source leaderboard framework for frontier models using public human-preference and hard-reasoning benchmark signals.
How to use this dashboard
Use this leaderboard to compare frontier-model quality signals across public benchmark, arena, and model-catalog references.
Frontier Leaderboard
40 records

| Rank | Model | Provider | Score | Source | Signal | Family | Modalities | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Google Gemini Pro Latest | | 95 | Public benchmark slot | Reasoning signal | Gemini | audio, file, image, text, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 2 | Google Gemini Flash Latest | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 3 | Google: Gemini 3.1 Flash Lite Preview | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, video, file, audio | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 4 | Google: Gemini 3.1 Pro Preview Custom Tools | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, audio, image, video, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 5 | Google: Gemini 3.1 Pro Preview | | 95 | Public benchmark slot | Reasoning signal | Gemini | audio, file, image, text, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 6 | Google: Gemini 3 Flash Preview | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 7 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 8 | Google: Gemini 2.5 Flash Lite | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 9 | Google: Gemini 2.5 Flash | | 95 | Public benchmark slot | Reasoning signal | Gemini | file, image, text, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 10 | Google: Gemini 2.5 Pro | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 11 | Google: Gemini 2.5 Pro Preview 06-05 | | 95 | Public benchmark slot | Reasoning signal | Gemini | file, image, text, audio | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 12 | Google: Gemini 2.5 Pro Preview 05-06 | | 95 | Public benchmark slot | Reasoning signal | Gemini | text, image, file, audio, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 13 | Anthropic Claude Sonnet Latest | ~Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 14 | OpenAI GPT Latest | ~OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | file, image, text | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 15 | Qwen: Qwen3.5 Plus 2026-04-20 | Qwen | 91 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 16 | Qwen: Qwen3.6 Flash | Qwen | 91 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 17 | OpenAI: GPT-5.5 Pro | OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | file, image, text | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 18 | OpenAI: GPT-5.5 | OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | file, image, text | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 19 | Anthropic: Claude Opus Latest | ~Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 20 | Anthropic: Claude Opus 4.7 | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 21 | Anthropic: Claude Opus 4.6 (Fast) | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 22 | Qwen: Qwen3.6 Plus | Qwen | 91 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 23 | xAI: Grok 4.20 Multi-Agent | xAI | 91 | Public benchmark slot | Reasoning signal | Grok | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 24 | xAI: Grok 4.20 | xAI | 91 | Public benchmark slot | Reasoning signal | Grok | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 25 | OpenAI: GPT-5.4 Pro | OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 26 | OpenAI: GPT-5.4 | OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 27 | Qwen: Qwen3.5-Flash | Qwen | 91 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 28 | Anthropic: Claude Sonnet 4.6 | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 29 | Qwen: Qwen3.5 Plus 2026-02-15 | Qwen | 91 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 30 | Anthropic: Claude Opus 4.6 | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 31 | xAI: Grok 4.1 Fast | xAI | 91 | Public benchmark slot | Reasoning signal | Grok | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 32 | Anthropic: Claude Sonnet 4.5 | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 33 | xAI: Grok 4 Fast | xAI | 91 | Public benchmark slot | Reasoning signal | Grok | text, image, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 34 | Anthropic: Claude Sonnet 4 | Anthropic | 91 | Public benchmark slot | Reasoning signal | Claude | image, text, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 35 | OpenAI: GPT-4.1 | OpenAI | 91 | Public benchmark slot | Reasoning signal | GPT | image, text, file | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 36 | Anthropic Claude Haiku Latest | ~Anthropic | 85 | Public benchmark slot | Reasoning signal | Claude | image, text | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 37 | OpenAI GPT Mini Latest | ~OpenAI | 85 | Public benchmark slot | Reasoning signal | GPT | file, image, text | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 38 | MoonshotAI Kimi Latest | ~MoonshotAI | 85 | Public benchmark slot | Reasoning signal | Kimi / Moonshot | text, image | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 39 | Qwen: Qwen3.6 35B A3B | Qwen | 85 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
| 40 | Qwen: Qwen3.6 27B | Qwen | 85 | Public benchmark slot | Reasoning signal | Qwen | text, image, video | Free proxy from public model metadata. Replace with sourced HLE/GPQA/LMArena scores when exact rows are available. |
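
Every score in the table is a placeholder proxy, and the 40 rows share only three values (95, 91, and 85), so the order within a tier carries little information. One way to read the table is by score tier rather than by rank. The sketch below is a minimal Python illustration of that grouping. The names `SAMPLE` and `group_by_score` are hypothetical, and the sample holds only four (rank, model, score) triples copied from the rows above, not the full table.

```python
from collections import defaultdict

# A few (rank, model, score) triples copied from the table above.
# The scores are the table's placeholder proxies, not sourced benchmark results.
SAMPLE = [
    (10, "Google: Gemini 2.5 Pro", 95),
    (20, "Anthropic: Claude Opus 4.7", 91),
    (26, "OpenAI: GPT-5.4", 91),
    (40, "Qwen: Qwen3.6 27B", 85),
]


def group_by_score(records):
    """Group models by score and return {score: [model, ...]}, highest score first."""
    tiers = defaultdict(list)
    for _rank, model, score in records:
        tiers[score].append(model)
    # Plain dicts keep insertion order, so inserting in sorted order fixes the tier order.
    return {score: tiers[score] for score in sorted(tiers, reverse=True)}


tiers = group_by_score(SAMPLE)
for score, models in tiers.items():
    print(score, models)
```

Once sourced HLE, GPQA, or LMArena scores replace the proxies, the same grouping would show whether the tiers separate into distinct per-model values.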