🏆 Best Model: x-ai/grok-3-mini-beta
Model | Win % | Wins | Losses | Total Completed |
---|---|---|---|---|
x-ai/grok-3-mini-beta | 91.8% | 1339 | 119 | 1458 |
anthropic/claude-3.7-sonnet | 58.5% | 853 | 605 | 1458 |
openai/gpt-4o | 49.7% | 724 | 734 | 1458 |
deepseek/deepseek-chat-v3-0324 | 37.8% | 551 | 906 | 1457 |
google/gemini-2.0-flash-001 | 29.1% | 424 | 1034 | 1458 |
mistralai/mistral-medium-3 | 28.9% | 421 | 1037 | 1458 |
meta-llama/llama-3.3-70b-instruct | 21.6% | 315 | 1143 | 1458 |
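
The Win % column appears to be wins divided by completed matches (wins plus losses), rounded to one decimal place. The sketch below recomputes it from the Wins and Losses columns under that assumption. The names `RESULTS`, `win_pct`, and `leaderboard` are hypothetical helpers for illustration and are not part of the original benchmark.

```python
# Rows copied from the table above: (model, wins, losses).
RESULTS = [
    ("x-ai/grok-3-mini-beta", 1339, 119),
    ("anthropic/claude-3.7-sonnet", 853, 605),
    ("openai/gpt-4o", 724, 734),
    ("deepseek/deepseek-chat-v3-0324", 551, 906),
    ("google/gemini-2.0-flash-001", 424, 1034),
    ("mistralai/mistral-medium-3", 421, 1037),
    ("meta-llama/llama-3.3-70b-instruct", 315, 1143),
]


def win_pct(wins: int, losses: int) -> float:
    """Win percentage over completed matches, rounded to one decimal.

    Assumption: a completed match is either a win or a loss (no draws).
    """
    total = wins + losses
    return round(100 * wins / total, 1) if total else 0.0


def leaderboard(rows):
    """Return (model, win %, wins, losses, total) tuples, best first."""
    table = [(m, win_pct(w, l), w, l, w + l) for m, w, l in rows]
    return sorted(table, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for model, pct, wins, losses, total in leaderboard(RESULTS):
        print(f"{model} | {pct}% | {wins} | {losses} | {total} |")
```

Each model is divided by its own total rather than a shared constant, which matters here because deepseek/deepseek-chat-v3-0324 has 1457 completed matches while the other models have 1458.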