AI vs Puzzles

Powered by OpenRouter

Measuring Machine Intelligence Through Puzzle Solving

A systematic evaluation framework testing large language models against structured cognitive challenges. Wordle. Sudoku. And beyond.

Puzzle Suite

Puzzles

Each puzzle type tests distinct cognitive capabilities: deduction, pattern recognition, semantic understanding, and constraint satisfaction.

Wordle

Five-letter word deduction with positional feedback constraints.


1687 puzzles, 8 models

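To make "positional feedback" concrete, here is a minimal sketch of how a Wordle guess could be scored. It follows the standard Wordle rules; the function name `wordle_feedback` and the `G`/`Y`/`.` output encoding are illustrative assumptions, not this framework's actual interface.

```python
from collections import Counter


def wordle_feedback(guess: str, answer: str) -> str:
    """Score a guess against the answer (hypothetical helper).

    G = correct letter in the correct position
    Y = letter occurs in the answer, but in a different position
    . = letter does not occur (or all its occurrences are already accounted for)
    """
    if len(guess) != len(answer):
        raise ValueError("guess and answer must have the same length")

    result = ["."] * len(guess)
    # Count answer letters that were NOT matched exactly; only these
    # can still justify a "Y" mark.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        elif remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1  # consume one occurrence so repeats are not over-credited
    return "".join(result)
```

For example, scoring the guess `crane` against the answer `react` yields `YYG.Y`: only the `a` is in the right position, while `c`, `r`, and `e` appear elsewhere in the answer.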

Connections

Categorical grouping of 16 words into four thematic clusters.


965 puzzles, 8 models


Countdown

Using only the provided numbers and arithmetic operations, find an expression that reaches the target number.


99 puzzles, 8 models

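A Countdown answer can be checked mechanically. The sketch below is one possible verifier, not the framework's own: the name `check_countdown` is hypothetical, and it assumes the usual Countdown convention that each provided number may be used at most once and that only `+`, `-`, `*`, and `/` are allowed. The expression is parsed with Python's `ast` module rather than `eval`, so arbitrary code in a model's answer is rejected.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Allowed binary operators (assumption: the four basic arithmetic operations).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def check_countdown(expression: str, numbers: list[int], target: int) -> bool:
    """Return True if the expression is legal and evaluates exactly to target."""
    used = []

    def evaluate(node):
        # Integer literal: record it and evaluate exactly with Fraction.
        if isinstance(node, ast.Constant) and type(node.value) is int:
            used.append(node.value)
            return Fraction(node.value)
        # Binary operation with an allowed operator.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = evaluate(node.left)
            right = evaluate(node.right)
            return _OPS[type(node.op)](left, right)
        raise ValueError("unsupported syntax")

    try:
        value = evaluate(ast.parse(expression, mode="eval").body)
    except (ValueError, SyntaxError, ZeroDivisionError):
        return False

    # Assumed rule: no number may be used more often than it was provided.
    if Counter(used) - Counter(numbers):
        return False
    return value == target
```

For instance, with the numbers 25, 50, 75, 100, 3, 6 and the target 952, the expression `((100 + 6) * 3 * 75 - 50) / 25` passes this check, while `3 * 3` fails for the numbers 3 and 6 because 3 is used twice.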

Sudoku

Fill a 9x9 grid with digits so that each row, column, and 3x3 subgrid contains every digit from 1 to 9 exactly once.


182 puzzles, 8 models

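The Sudoku constraint translates directly into a completed-grid check. The following is a minimal sketch, assuming the solution is given as a 9x9 list of integer lists; the name `is_valid_solution` is hypothetical, and the check does not verify that a solution agrees with the original clues.

```python
def is_valid_solution(grid: list[list[int]]) -> bool:
    """Check that every row, column, and 3x3 subgrid holds the digits 1-9 exactly once."""
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False

    expected = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    # One set per 3x3 subgrid, indexed by its top-left corner (br, bc).
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    # Each unit has nine cells, so equalling {1..9} implies no repeats.
    return all(unit == expected for unit in rows + cols + boxes)
```

Because every unit contains exactly nine cells, comparing each unit's set of values with {1, ..., 9} is enough to rule out both missing and repeated digits.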

GeoGuessr

Given a street view image, guess the exact location on Earth where it was taken.


91 puzzles, 6 models


Domino

Identify the missing domino from an image of a grid of dominoes.


91 puzzles, 6 models


Join The Research

If there are any LLMs or puzzles you'd like to see added, or if you'd like to contribute directly, please get in touch.

Get In Touch