🏆 Best Model: x-ai/grok-3-mini-beta

Model | Win % | Wins | Losses | Total Completed |
---|---|---|---|---|
x-ai/grok-3-mini-beta | 91.9% | 1423 | 126 | 1549 |
anthropic/claude-3.7-sonnet | 58.2% | 902 | 647 | 1549 |
openai/gpt-4o | 49.4% | 765 | 784 | 1549 |
deepseek/deepseek-chat-v3-0324 | 37.9% | 586 | 962 | 1548 |
google/gemini-2.0-flash-001 | 29.4% | 455 | 1094 | 1549 |
mistralai/mistral-medium-3 | 29.0% | 449 | 1100 | 1549 |
meta-llama/llama-3.3-70b-instruct | 21.8% | 337 | 1212 | 1549 |
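
The Win % column appears to be wins divided by total completed games (wins plus losses), rounded to one decimal place. That formula is an assumption inferred from the numbers, not something the leaderboard states. A minimal Python sketch that recomputes the column from the rows above (the names `results` and `win_pct` are illustrative only):

```python
# Rows copied from the table above: (model, wins, losses).
results = [
    ("x-ai/grok-3-mini-beta", 1423, 126),
    ("anthropic/claude-3.7-sonnet", 902, 647),
    ("openai/gpt-4o", 765, 784),
    ("deepseek/deepseek-chat-v3-0324", 586, 962),
    ("google/gemini-2.0-flash-001", 455, 1094),
    ("mistralai/mistral-medium-3", 449, 1100),
    ("meta-llama/llama-3.3-70b-instruct", 337, 1212),
]


def win_pct(wins: int, losses: int) -> float:
    """Win percentage, assuming Win % = wins / (wins + losses)."""
    total = wins + losses
    return round(100 * wins / total, 1)


# Print each model with its recomputed win percentage.
for model, wins, losses in results:
    print(f"{model}: {win_pct(wins, losses):.1f}%")

# The best model is the row with the highest win percentage.
best_model = max(results, key=lambda row: win_pct(row[1], row[2]))[0]
print(f"Best model: {best_model}")
```

Note that deepseek/deepseek-chat-v3-0324 has 1548 completed games rather than 1549, so its percentage is computed against its own total.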