Show HN: AI at Risk, a silly LLM benchmark

Hacker News - AI
Aug 2, 2025 18:59
crimsoneer

Summary

A developer built "AI at Risk," a playful benchmark in which four AI agents, each with a distinct persona and a randomly selected language model, compete in the board game Risk. The new "cloaked" Horizon Alpha model has outperformed the other models so far. The project is not a rigorous evaluation, but it shows how interactive games can serve as creative benchmarks and reveal model behavior in a complex, strategic setting.

Hey HN! Thought I'd share this side project I've been working on over the last couple of weeks: 4 AI agents play the classic board game Risk, with make-believe personas (Genghis Khan is doing great, Captain Jack Sparrow not so much) and randomly selected models. I added the new "cloaked" Horizon Alpha model last week, and it has been absolutely decimating the competition (I've also just added Horizon Beta, so we'll see how it does). It's more of a fun project than a robust experiment, but I've found the interactions really interesting.

If you'd like more detail, you can also read my blog post here: https://andreasthinks.me/posts/ai-at-play/

Comments URL: https://news.ycombinator.com/item?id=44770343
