We Taught AI to Work, Not Play

June 25, 2025
3 min read
By Ryan D'Onofrio
AI · Games · LLMs · Social Interaction · Gaming

In 2021, I moved 3,000 miles away from my friends. What bridged the gap wasn't FaceTime or group chats. It was late-night Discord sessions, screaming at Valorant and later chasing dubs in Fortnite.

This isn't new. Growing up, it was my family at the living room table, dealing endless hands of rummy. Today, it's a long-running Scrabble game with my wife, low-stakes online poker, and correspondence chess with my brother. Games are the quintessential "third place," the space where we can just be with people.

They're the perfect medium for authentic interaction. They can have near-infinite skill ceilings, like chess, or a healthy dose of luck, like poker. They can be single-player, like blackjack, or demand multiple "agents," like League of Legends.

And AI used to get this.

The History of AI is a History of Games

The history of AI is a history of games. It's a cliché for a reason. Deep Blue, AlphaGo, OpenAI Five, Cicero: every major breakthrough was benchmarked in the clean room of a game. They were perfect labs with clear rules, measurable outcomes, and deep strategic spaces.

And then LLMs happened. And the games just... stopped.

Suddenly, the focus shifted entirely to productivity. LLMs got incredibly good at writing, reasoning, and coding. But they became terrible at playing. Papers like PokerBench revealed that models like GPT-4 fail at poker because poker hands are basically absent from their training data. We even saw GPT-3.5-turbo's decent chess ability get wiped out by instruction tuning (I think this is a token probability problem, but I'm not sure… will look into it).
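
For what it's worth, that hunch is easy to poke at. Here's a minimal sketch (assuming the OpenAI Python SDK with an API key in your environment; the model name and prompt are just placeholders, not anything from the papers above) that asks a chat model for a chess move and prints how the probability mass is spread across its top candidate tokens:

```python
# Quick probe of the "token probability" hunch: ask a chat model for the next
# chess move and inspect the alternatives it weighed for the first token.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY set in the environment.
import math
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You are playing chess. The game so far (PGN): 1. e4 e5 2. Nf3 Nc6 3. Bb5\n"
    "Reply with Black's next move in SAN only."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model that returns logprobs
    messages=[{"role": "user", "content": PROMPT}],
    max_tokens=4,
    logprobs=True,
    top_logprobs=5,
)

# Alternatives the model considered for the very first token of its move.
first_token = response.choices[0].logprobs.content[0]
for alt in first_token.top_logprobs:
    print(f"{alt.token!r}: {math.exp(alt.logprob):.2%}")
```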

It's like we took this nascent intelligence and exclusively trained it for the office. We instruction-tuned the fun right out of it. We made it an almost-competent intern, but in the process, we killed the playful spark.

This is Completely Backward

Games should matter more now, not less.

Unless we want LLMs to be stuck forever as tools, they need to learn to play with us. They don't develop a personality through interaction. They don't remember that you always bluff on the river or get genuinely tilted after a bad beat. They have no memory, no presence. (And I don't think vector databases, large context windows, or summarization models are true memory, but that's a post for another day.)
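
To make "memory" concrete, here's a toy sketch in plain Python (every name is made up; it's nothing like a real system) of what actually remembering a player could look like: a tiny profile that accumulates your river bluff rate across sessions, instead of re-reading transcripts.

```python
# Toy sketch of real player memory: a profile that accumulates tendencies
# across sessions rather than stuffing transcripts into a vector store.
# All names and fields are hypothetical.
from dataclasses import dataclass


@dataclass
class PlayerProfile:
    name: str
    river_bets: int = 0    # times this player bet the river
    river_bluffs: int = 0  # times that bet turned out to be a bluff
    bad_beats: int = 0     # rough proxy for tilt

    def record_river_bet(self, was_bluff: bool) -> None:
        self.river_bets += 1
        if was_bluff:
            self.river_bluffs += 1

    @property
    def river_bluff_rate(self) -> float:
        return self.river_bluffs / self.river_bets if self.river_bets else 0.0


profile = PlayerProfile(name="ryan")
profile.record_river_bet(was_bluff=True)
profile.record_river_bet(was_bluff=True)
profile.record_river_bet(was_bluff=False)
print(f"{profile.name} bluffs the river {profile.river_bluff_rate:.0%} of the time")
```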

The early attempts to fix this are missing the point. Microsoft's Xbox Copilot is just a glorified hint system. That's not what we need. I don't want a tool that helps me win; I want an AI that I can hang out with.

This is why I'm building llm.poker. Right now, playing against an LLM feels sterile, like facing a polite computer that doesn't understand the stakes. But the potential is there. Maybe they can learn to play exploitatively too.
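
As a back-of-the-napkin illustration of what exploitative play means (this is not how llm.poker works, and the numbers are toy): once you know someone bluffs the river too often, you can profitably call with less equity than the "correct" baseline.

```python
# Hypothetical exploitative adjustment: shift a baseline calling threshold
# based on an observed river bluff rate. Numbers are illustrative only.
def call_threshold(baseline_equity: float, observed_bluff_rate: float,
                   expected_bluff_rate: float = 0.30) -> float:
    """Lower the equity needed to call when an opponent bluffs more than expected."""
    adjustment = (observed_bluff_rate - expected_bluff_rate) * 0.5
    return max(0.05, baseline_equity - adjustment)


# Against a player we've seen bluff two-thirds of their river bets,
# we can call with noticeably less equity than the baseline.
print(call_threshold(baseline_equity=0.35, observed_bluff_rate=0.67))  # ~0.17
```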

The Next Leap

The next leap isn't about more parameters or better benchmarks on enterprise tasks. It's about presence. It's about multimodal AI that can see your expression, hear the hesitation in your voice, and learn your tells. An AI that can dish out some good-natured trash talk because it knows your style. An AI that can be genuinely silly, not just professionally helpful.

I hope the next LLMs learn how to actually show up and play.