🏆 Best Model: x-ai/grok-3-mini-beta
Model | Win % | Wins | Losses | Total Completed |
---|---|---|---|---|
x-ai/grok-3-mini-beta | 91.8% | 1381 | 123 | 1504 |
anthropic/claude-3.7-sonnet | 58.4% | 878 | 626 | 1504 |
openai/gpt-4o | 49.3% | 742 | 762 | 1504 |
deepseek/deepseek-chat-v3-0324 | 37.6% | 565 | 938 | 1503 |
google/gemini-2.0-flash-001 | 29.1% | 437 | 1067 | 1504 |
mistralai/mistral-medium-3 | 28.7% | 431 | 1073 | 1504 |
meta-llama/llama-3.3-70b-instruct | 21.7% | 326 | 1178 | 1504 |
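
The Win % column appears to be wins divided by total completed matches, rounded to one decimal place. The page does not state this; it is inferred from the numbers, which also suggest that Total Completed equals Wins plus Losses. A minimal Python sketch that reproduces the column under those assumptions (the function name `win_pct` is hypothetical):

```python
def win_pct(wins: int, losses: int) -> float:
    """Win percentage, ASSUMING total completed = wins + losses (inferred, not stated)."""
    total = wins + losses
    if total == 0:
        return 0.0  # no completed matches; avoid division by zero
    return round(100 * wins / total, 1)


# (model, wins, losses) copied from the table above
rows = [
    ("x-ai/grok-3-mini-beta", 1381, 123),
    ("anthropic/claude-3.7-sonnet", 878, 626),
    ("openai/gpt-4o", 742, 762),
    ("deepseek/deepseek-chat-v3-0324", 565, 938),
    ("google/gemini-2.0-flash-001", 437, 1067),
    ("mistralai/mistral-medium-3", 431, 1073),
    ("meta-llama/llama-3.3-70b-instruct", 326, 1178),
]

for model, wins, losses in rows:
    print(f"{model}: {win_pct(wins, losses)}% of {wins + losses}")
```

Under this assumption the deepseek/deepseek-chat-v3-0324 row is computed over 1503 matches rather than 1504, which matches its Total Completed value.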