John Carmack on the Future of AI: Real Learning, Not Just Language Models
Created on May 23|Last edited on May 23
John Carmack’s appearance at Upper Bound 2025 marked a pointed turn in his already boundary-defying career. From shaping 3D gaming with Doom and Quake to rethinking space travel and VR, Carmack is now steering into AI—specifically reinforcement learning that mimics human-like experience. With his startup Keen Technologies and reinforcement learning pioneer Richard Sutton on board, Carmack is targeting the deeper, unresolved layers of intelligence: learning, memory, and adaptability in open, unsimulated environments.
The Limits of Large Language Models
Carmack is not riding the LLM wave. While large models like GPT-4 have defined AI in the public eye, he argued they lack what matters most—real, grounded experience. “LLMs can know everything without learning anything,” he said, underscoring their lack of embodiment and inability to engage with the physical world. Rather than systems that merely predict text, Carmack wants agents that live and learn in the world—much like a child learning physics through spilled juice rather than a textbook. His critique wasn’t just philosophical; it pointed to a technical rift between pattern recognition and autonomous competence.
Real-Time Learning and Physical Embodiment
A central thread in Carmack’s talk was embodiment—the idea that intelligence must engage with the real world. To drive this home, Keen Technologies built a physical Atari-playing robot, using a real joystick and a camera pointed at a screen. The setup reveals the sharp difference between simulation and real-world constraints: servos stall, joysticks drift, USB cameras lag. These complications aren’t annoyances—they are the problem. Carmack is skeptical of bold AGI timelines unless researchers wrestle with these grounded challenges. “Simulation isn’t reality,” he said, warning against overconfidence rooted in lab environments.
Curiosity, Sparse Rewards, and Meta-Motivation
Carmack also focused on how agents are motivated. Most RL agents rely on dense, explicit feedback. But what about curiosity? What about exploration without immediate reward? He’s pushing for agents that find novelty rewarding on its own terms—especially in sparse-reward games like Montezuma’s Revenge, where success might only come after a long stretch of failure. He also floated the idea of “meta-curiosity”—agents that get bored, that rage-quit, or switch tasks as humans do. For Carmack, those behaviors aren’t distractions; they are indicators of general intelligence. An agent that never gets bored might not be general at all.
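One simple way to make novelty "rewarding on its own terms" is a count-based curiosity bonus, where an agent earns intrinsic reward that decays as a state is revisited. The sketch below is illustrative only; it is not Keen's method, and the class name and decay schedule are assumptions for the example.

```python
from collections import defaultdict
import math

class CountBasedCuriosity:
    """Minimal count-based exploration sketch: the n-th visit to a
    state earns a bonus proportional to 1/sqrt(n), so novel states
    are rewarding even when the environment reward is zero."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = defaultdict(int)

    def bonus(self, state):
        self.visits[state] += 1
        return self.scale / math.sqrt(self.visits[state])

curiosity = CountBasedCuriosity(scale=0.5)
first = curiosity.bonus("room_1")   # largest bonus on first visit: 0.5
repeat = curiosity.bonus("room_1")  # decays on revisits: 0.5/sqrt(2)
```

In a sparse-reward game like Montezuma’s Revenge, such a bonus keeps the agent moving toward unexplored rooms during the long stretch before any real score arrives.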
Catastrophic Forgetting and Lifelong Memory
One of the most pressing challenges Carmack raised was memory. Today’s RL systems suffer from catastrophic forgetting—losing old skills when learning new ones. He emphasized the human capacity to retain and recall learned tasks across years, and contrasted that with current models which often revert to random play on earlier games. Carmack envisions agents that not only master a sequence of games, but can rotate between them without performance collapse. He argued for a future where agents can transfer knowledge from dozens of prior games to a new one, just like a seasoned gamer applying instincts across genres.
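One standard mitigation for catastrophic forgetting during task rotation is interleaved rehearsal: keep a buffer of transitions from earlier games and mix them into each training batch for the current game. The sketch below is a generic illustration under assumed names and ratios, not a description of Keen's actual training setup.

```python
import random

class RehearsalBuffer:
    """Sketch of interleaved rehearsal: transitions from earlier tasks
    are retained and mixed into batches for the current task, so updates
    on a new game are balanced against reminders of older ones."""

    def __init__(self, capacity_per_task=1000):
        self.capacity = capacity_per_task
        self.tasks = {}  # task name -> list of transitions

    def add(self, task, transition):
        buf = self.tasks.setdefault(task, [])
        if len(buf) >= self.capacity:
            buf.pop(random.randrange(len(buf)))  # evict a random old entry
        buf.append(transition)

    def sample_batch(self, current_task, batch_size, old_fraction=0.5):
        """Mix current-task data with rehearsal data from older tasks."""
        n_old = int(batch_size * old_fraction)
        current = self.tasks.get(current_task, [])
        old = [t for task, buf in self.tasks.items()
               if task != current_task for t in buf]
        batch = random.sample(current, min(batch_size - n_old, len(current)))
        if old:
            batch += random.sample(old, min(n_old, len(old)))
        return batch
```

Rehearsal is only one family of fixes (regularization and modular architectures are others), but it illustrates the core idea: an agent that never revisits Pong while learning Breakout has no mechanism to keep Pong alive.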
Architecture, Exploration, and Latency
Carmack’s critiques extended deep into the technical architecture of modern AI systems. Despite advances in large image models, he found smaller, handcrafted CNNs often outperform them in RL—suggesting that many advances don’t translate when real-time decision-making is involved. Parameter sharing, biological plausibility, and architecture design all remain unsolved puzzles. He also raised concerns about common exploration strategies like epsilon-greedy, which can disastrously fail in games requiring precision. Latency—whether from software or hardware—further complicates things. Agents trained in deterministic environments falter under real-time, noisy conditions. Carmack emphasized that truly robust AI must thrive under lag, not just in clean simulators.
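The epsilon-greedy failure mode is easy to see in code. The standard rule takes a uniformly random action with probability epsilon; in a real-time precision game, even a small epsilon forces regular random inputs at exactly the moments where one wrong action is fatal. The numbers below are illustrative, not from the talk.

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Standard epsilon-greedy: with probability epsilon, act uniformly
    at random; otherwise act greedily. At 60 FPS with epsilon=0.01 that
    is still about 0.6 forced random actions per second, enough to ruin
    a frame-perfect jump in a precision platformer."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# epsilon=0 is fully greedy: always picks the highest-valued action.
best = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)  # -> 1
```

This is why alternatives such as noisy networks or intrinsic-motivation bonuses are often preferred when a single random action can end an episode.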
Honest Benchmarks and the Push for Realism
Carmack called for a revamp of evaluation standards. Benchmarks should reflect real-world messiness: sticky inputs, truncated observations, and physical anomalies. He pushed for rotating benchmarks where agents cycle through diverse tasks, forcing them to show retention and generalization. “Stop treating Atari like a turn-based board game,” he said. AI must cope with time not standing still—with incomplete data, degraded performance, and unexpected shifts. He lamented how many frameworks break quickly or are abandoned, and recommended interfacing directly with low-level libraries like the Arcade Learning Environment (ALE) instead of more abstract wrappers like Gym.
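"Sticky inputs" have a well-known formalization in Atari evaluation: with some probability, the environment repeats the previous action instead of the new one, so a memorized frame-perfect policy stops working. The wrapper below is a minimal sketch of that idea; the `env` object is assumed to be anything with a `step(action)` method, a stand-in rather than a real Gym or ALE interface.

```python
import random

class StickyActionEnv:
    """Sketch of the sticky-actions protocol: with probability
    repeat_prob, the previous action is executed instead of the newly
    chosen one, defeating policies that exploit determinism."""

    def __init__(self, env, repeat_prob=0.25, seed=None):
        self.env = env
        self.repeat_prob = repeat_prob
        self.rng = random.Random(seed)
        self.last_action = 0  # start from NOOP

    def step(self, action):
        if self.rng.random() < self.repeat_prob:
            action = self.last_action  # the joystick "sticks"
        self.last_action = action
        return self.env.step(action)
```

A physical joystick driven by a servo exhibits exactly this kind of noise for free, which is part of why Carmack's hardware rig is a harsher benchmark than a clean emulator.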
Open Source, Humility, and the Mess of Reality
Despite the criticisms, Carmack’s tone was not defeatist—it was pragmatic and open. He’s releasing Keen’s Atari hardware and codebase to the community, inviting others into the “mess.” He wants the AI field to embrace failure, to stop hiding its rough edges, and to accept that agents capable of navigating reality will be forged in the noise, not shielded from it. There was a recurring humility in his remarks—acknowledging mistakes, doubting received wisdom, and stressing transparency over hype. His message was as much philosophical as technical: intelligence won’t emerge from perfection, but from persistence in the face of chaos.
In Carmack’s view, we don’t just need smarter algorithms. We need agents that learn from the world, remember what they’ve learned, get curious, get bored, and keep going. Only then, he argues, will we be anywhere near the kind of AI that can truly be called intelligent.
Tags: ML News