Wayne Goodchild Editor
Fact checked by: Jorgen Johansson
Updated: March 4, 2025
Hao Lab Test AI Models On Super Mario

Hao AI Lab, part of the University of California San Diego, recently put a variety of AI models through their paces by testing them on the original Super Mario Bros. from 1985. The objective was to help create gaming models that can be applied to future AI uses.

Hao AI Lab’s main aim is to democratize large machine learning systems, so that anyone can use them for purposes beyond those they were originally built for. In the Super Mario test, Claude 3.7, a recent superstar in the AI-plays-games arena, performed best, while GPT-4o was less impressive.

“We threw AI gaming agents into LIVE Super Mario games and found Claude-3.7 outperformed other models with simple heuristics,” Hao AI Lab said in a recent post on X (Twitter).

“We believe games provide challenging and dynamic environments for testing LLM [large language model] agents.”

Gaming Models

LMGames is the name of the Hao AI Lab team behind the recent Super Mario test, and as part of what it’s calling the GamingAgent project the team has released the relevant source code. This is published under an MIT licence, meaning it’s free to use, adapt or refine, as long as the original copyright and licence notice is kept with any copies. Unlike copyleft licences, it does not require subsequent projects to use the same licence.

At the moment, GamingAgent works with 2048 and Tetris, as well as Super Mario, and it can handle a few AI models created by OpenAI, Anthropic and Google (Gemini). As such, there’s plenty of room for the code to be expanded to run on other games and AI models.

AI Benchmarks

Testing and training AI by getting it to play games is not a new idea. Back in 2019, Greg Brockman, a co-founder of OpenAI, was already talking about the company’s use of games to test and refine the capabilities of its AI systems.

“Games have always been a benchmark for AI. If you can’t solve games, you can’t expect to solve anything else,” he said in a New York Times article.

Wayne Goodchild

Editor, occasional game dev, constant dad, horror writer, noisy musician. I love games that put effort into fun mechanics, even if there’s a bit of jank here and there. I’m also really keen on indie dev news. My first experience with video games was through the Game and Watch version of Donkey Kong, because I’m older than I look.