Show HN: AI at Risk, a silly LLM benchmark

Hacker News - AI
Aug 2, 2025 18:59
crimsoneer
1 view
hackernews, ai, discussion

Summary

A developer created "AI at Risk," a playful benchmark where four AI agents with distinct personas compete in the board game Risk, using various language models. The new "cloaked" Horizon Alpha model has shown strong performance, outperforming others in the game. While not a rigorous evaluation, the project highlights the potential for creative, interactive AI benchmarks and offers insights into model behavior in complex, strategic environments.

Hey HN! Thought I'd share this side project I've been working on for the last couple of weeks: 4 AI agents play the classic board game Risk, with make-believe personas (Genghis Khan is doing great, Captain Jack Sparrow not so much) and randomly selected models. I added the new "cloaked" Horizon Alpha model last week, and it has been absolutely decimating the competition (I've also just added Horizon Beta, so we'll see how it does). It's more fun than a robust experiment, but I've found the interactions really interesting. If you'd like more detail, you can also read my blog post here: https://andreasthinks.me/posts/ai-at-play/

Comments URL: https://news.ycombinator.com/item?id=44770343
Points: 1
# Comments: 0
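For anyone curious what "personas with randomly selected models" could look like in practice, here is a minimal sketch of one way to pair them per game. The persona and model names below are illustrative assumptions, not the project's actual code or configuration:

```python
import random

# Illustrative names only -- not the project's real persona or model lists.
PERSONAS = ["Genghis Khan", "Captain Jack Sparrow", "Napoleon", "Sun Tzu"]
MODELS = ["horizon-alpha", "horizon-beta", "model-c", "model-d"]

def assign_agents(personas, models, seed=None):
    """Pair each persona with a randomly drawn model (with replacement),
    so the same model can back more than one persona in a game."""
    rng = random.Random(seed)
    return {persona: rng.choice(models) for persona in personas}

agents = assign_agents(PERSONAS, MODELS, seed=42)
for persona, model in agents.items():
    print(f"{persona} -> {model}")
```

Seeding the RNG makes a given game's persona/model lineup reproducible, which helps when comparing how the same model performs under different personas.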