Show HN: AI at Risk, a silly LLM benchmark

Hacker News - AI
Aug 2, 2025 18:59
crimsoneer
1 view
hackernews, ai, discussion

Summary

A developer created "AI at Risk," a playful benchmark where four AI agents with distinct personas compete in the board game Risk, using various language models. The new "cloaked" Horizon Alpha model has shown strong performance, outperforming others in the game. While not a rigorous evaluation, the project highlights the potential for creative, interactive AI benchmarks and offers insights into model behavior in complex, strategic environments.

Hey HN! Thought I'd share this side project I've been working on for the last couple of weeks: 4 AI agents play the classic board game Risk, with make-believe personas (Genghis Khan is doing great, Captain Jack Sparrow not so much) and randomly selected models. I added the new "cloaked" Horizon Alpha model last week, and it has been absolutely decimating the competition (I've also just added Horizon Beta, so we'll see how it does). It's more fun than a robust experiment, but I've found the interactions really interesting. If you'd like more detail, you can also read my blog post here: https://andreasthinks.me/posts/ai-at-play/

Comments URL: https://news.ycombinator.com/item?id=44770343
Points: 1
# Comments: 0
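For anyone curious what "personas with randomly selected models" could look like in practice, here is a minimal sketch of one way to pair them per game. The persona and model names below are illustrative assumptions, not the project's actual code or configuration:

```python
import random

# Illustrative names only -- not the project's real persona or model lists.
PERSONAS = ["Genghis Khan", "Captain Jack Sparrow", "Napoleon", "Sun Tzu"]
MODELS = ["horizon-alpha", "horizon-beta", "model-c", "model-d"]

def assign_agents(personas, models, seed=None):
    """Pair each persona with a randomly drawn model (with replacement),
    so the same model can back more than one persona in a game."""
    rng = random.Random(seed)
    return {persona: rng.choice(models) for persona in personas}

agents = assign_agents(PERSONAS, MODELS, seed=42)
for persona, model in agents.items():
    print(f"{persona} -> {model}")
```

Seeding the RNG makes a given game's persona/model lineup reproducible, which helps when comparing how the same model performs under different personas.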