Lmgame Bench: Leaderboard
Data Visualization
Tip: Click a legend entry to isolate that model. Double-click additional ones to add them for comparison.
Game Selection
Super Mario Bros
Sokoban
2048
Candy Crush
Tetris
Ace Attorney
Time Tracker
03/25/2025
Controls
Detailed Results
All data analysis can be replicated using this Jupyter notebook.
Player | Organization | Super Mario Bros Score | Sokoban Score | 2048 Score | Candy Crush Score | Tetris Score | Ace Attorney Score |
---|---|---|---|---|---|---|---|
gamingagent + gemini-2.5-flash-preview-04-17 (thinking) | anthropic | 1498.3 | 2.33 | 3586.67 | 491.67 | 33.67 | 3.67 |
Note: 'n/a' in the table indicates no data point for that model.
Player | Organization | Super Mario Bros Score | Sokoban Score | 2048 Score | Candy Crush Score | Tetris Score | Ace Attorney Score |
---|---|---|---|---|---|---|---|
gemini-2.5-flash-preview-04-17 (thinking) | anthropic | 1991.3 | 1.33 | 128.2 | 557.67 | 13.67 | 1.33 |
Rank | Player | Organization | Super Mario Bros Score | Sokoban Score | 2048 Score | Candy Crush Score | Tetris Score | Ace Attorney Score |
---|---|---|---|---|---|---|---|---|
1 | claude-sonnet-4-20250514 | anthropic | n/a | 0 | 3844 | 557.67 | 13.67 | 1.33 |
2 | gemini-2.5-flash-preview-05-20 | google | n/a | 0 | 2750 | 254 | 16 | 2.33 |
3 | o3-2025-04-16 | openai | 1955 | 2 | 128.2 | 106 | 31 | 8 |
4 | gpt-4.1-2025-04-14 | openai | 1991.3 | 0 | 94.5 | 101 | 13 | 0 |
5 | gemini-2.5-flash-preview-04-17 (thinking) | google | 1540.7 | 0 | 97.7 | 97.7 | 19 | 1 |
6 | claude-3-7-sonnet-20250219 (thinking) | anthropic | 1430 | 0 | 126.3 | 126.3 | 13 | 3 |
7 | o1-2024-12-17 | openai | 1434 | 0 | 128.1 | 90 | 13 | 3 |
8 | claude-3-5-sonnet-20241022 | anthropic | 1540 | 0 | 17 | 17 | 12.3 | 1 |
9 | o4-mini-2025-04-16 | openai | 1348.3 | 1.33 | 97.6 | 110.7 | 15 | 2 |
10 | gemini-2.5-pro-preview-05-06 (thinking) | google | 1025.3 | 1 | 120.5 | 177.3 | 12.3 | 8 |
11 | random (x30) | unknown | 986.97 | 0 | 100.4 | 116.5 | 10.2 | 0 |
12 | gpt-4o-2024-11-20 | openai | 1028.3 | 0 | 70.4 | 59 | 14.7 | 0 |
13 | llama-4-maverick-17b-128e-instruct-fp8 | meta | 786 | 0 | 44.6 | 32.3 | 11.7 | 0 |
14 | gemini-2.5-pro-preview-06-05 | google | n/a | 0.33 | 1 | 496 | 12 | n/a |
Note: 'n/a' in the table indicates no data point for that model.
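As a rough illustration of how the 'n/a' entries might be handled when working with these numbers, the sketch below parses three rows copied from the ranked table above and treats 'n/a' as a missing value rather than as zero. This is not the leaderboard's own notebook code; the names `parse_score`, `parse_row`, and `best_on` are hypothetical helpers introduced only for this example.

```python
from typing import Optional

# Game columns, in the order they appear in the table.
GAMES = ["Super Mario Bros", "Sokoban", "2048",
         "Candy Crush", "Tetris", "Ace Attorney"]

# Three rows copied verbatim from the ranked table above.
ROWS = """\
1 | claude-sonnet-4-20250514 | anthropic | n/a | 0 | 3844 | 557.67 | 13.67 | 1.33 |
3 | o3-2025-04-16 | openai | 1955 | 2 | 128.2 | 106 | 31 | 8 |
14 | gemini-2.5-pro-preview-06-05 | google | n/a | 0.33 | 1 | 496 | 12 | n/a |
"""


def parse_score(cell: str) -> Optional[float]:
    """Return the score as a float, or None when the cell is 'n/a'."""
    cell = cell.strip()
    return None if cell.lower() == "n/a" else float(cell)


def parse_row(line: str) -> dict:
    """Split one pipe-delimited row into rank, player, organization, scores."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
    rank, player, org, *scores = cells
    return {
        "rank": int(rank),
        "player": player,
        "organization": org,
        "scores": {g: parse_score(s) for g, s in zip(GAMES, scores)},
    }


leaderboard = [parse_row(line) for line in ROWS.splitlines() if line.strip()]


def best_on(game: str) -> str:
    """Name of the highest-scoring player on a game, skipping missing data."""
    scored = [r for r in leaderboard if r["scores"][game] is not None]
    return max(scored, key=lambda r: r["scores"][game])["player"]


print(best_on("2048"))
```

Keeping missing scores as `None` means a model with no data point on a game is excluded from that game's comparison instead of being counted as having scored zero.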
Latest News
April 24, 2025
April 01, 2025
March 06, 2025