At Sequoia AI Ascent 2024, Andrej Karpathy delivered a thoughtful talk on the current state of AI and how to foster a thriving AI ecosystem. Drawing analogies from computing history and sharing insights from his experiences, Karpathy offered a roadmap for the future of AI. Watch the full talk | Read the transcript
Here are the key takeaways:
While the industry touts ever-bigger and better models, Karpathy sees a more heterogeneous architecture in its future. From the enormous amounts of data, compute, and electrical power needed to train models to the limited use of reinforcement learning in training, he sees signs that the industry is ripe for a change in direction, if not outright disruption.
Rather than an all-in-one system, Karpathy compares the LLM to an operating system at the heart of an AI system, similar to how a desktop computer’s OS supports applications and peripherals, or how the iPhone became the heart of an app ecosystem.
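To make the analogy concrete, here is a minimal, purely hypothetical Python sketch of an “LLM kernel” that routes requests to peripherals (tools), much as an OS mediates between applications and hardware. The call_llm stub and the tool names are illustrative assumptions, not anything specified in the talk.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real model call: maps a request to a tool invocation."""
    if "calculate" in prompt:
        return "calculator: " + prompt.split("calculate ", 1)[1]
    return "browser: https://example.com"

# "Peripherals" the kernel can delegate to, like devices behind an OS.
TOOLS = {
    "calculator": lambda arg: str(eval(arg, {"__builtins__": {}})),  # toy only
    "browser": lambda arg: f"(would fetch {arg})",
}

def llm_kernel(user_request: str) -> str:
    """Central loop: the LLM decides which peripheral handles the request."""
    decision = call_llm(user_request)
    tool_name, _, arg = decision.partition(": ")
    return TOOLS[tool_name](arg)

print(llm_kernel("please calculate 2+2"))  # -> "4"
```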
Karpathy compared the current state of AI to the early days of AlphaGo: that system first learned by imitating human games, and only surpassed human play once it could improve itself through self-play reinforcement learning.
Today’s reinforcement learning methods for language models, like RLHF, are primitive by comparison, relying heavily on human feedback, essentially a “vibe check” rather than rigorous learning. True self-reinforcement, like humans questioning and answering themselves, remains limited to niches like gameplay.
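As a toy illustration of the difference, the sketch below contrasts where the learning signal comes from: an RLHF-style step scores samples with a stand-in preference model, while a self-play step gets its reward from the rules of a game. Everything here (the scoring proxy, the toy game) is an assumption made for illustration, not Karpathy’s method.

```python
import random

# Toy contrast, for illustration only: where does the reward come from?

def preference_score(response: str) -> float:
    """Stand-in for a reward model trained on human preference labels;
    in RLHF the ultimate ground truth is a human 'vibe check'."""
    return len(set(response)) / max(len(response), 1)  # arbitrary proxy

def game_reward(move_a: int, move_b: int) -> float:
    """Stand-in for a game with an objective outcome (think Go):
    the reward comes from the rules, not from human judgement."""
    return 1.0 if move_a > move_b else -1.0

# RLHF-style step: sample candidate outputs, keep the one a preference
# model (trained on human labels) happens to rank highest.
candidates = ["a first response", "another longer response"]
preferred = max(candidates, key=preference_score)

# Self-play step: the agent generates its own experience, and the
# environment decides who won, so learning is not capped by the teacher.
values = {move: 0.0 for move in range(3)}
for _ in range(1000):
    a, b = random.choice(list(values)), random.choice(list(values))
    values[a] += 0.01 * game_reward(a, b)  # reinforce winning moves

print(preferred, max(values, key=values.get))
```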
Karpathy also highlighted the staggering inefficiency of current AI systems and pointed to where future improvements may come from:
Karpathy finds it strange that diffusion models (used for generating images) and transformers (dominant in language tasks) remain so distinct. He sees an opportunity to merge these approaches into a unified architecture.
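One concrete shape such a merger could take is a transformer used as the denoising network inside a diffusion step, in the spirit of recent “diffusion transformer” designs. The sketch below is a minimal PyTorch illustration under assumptions of my own (the dimensions, the linear noise mixing, and the fake patch embeddings are placeholders), not a design Karpathy prescribed.

```python
import torch
import torch.nn as nn

class TransformerDenoiser(nn.Module):
    """A transformer backbone predicting the noise added to image patches."""
    def __init__(self, dim=64, heads=4, layers=2):
        super().__init__()
        self.time_embed = nn.Linear(1, dim)  # embed the noise level
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(encoder_layer, num_layers=layers)
        self.out = nn.Linear(dim, dim)       # predict the added noise

    def forward(self, noisy_patches, t):
        # noisy_patches: (batch, patches, dim); t: (batch, 1) noise level
        h = noisy_patches + self.time_embed(t).unsqueeze(1)
        return self.out(self.backbone(h))

# One toy denoising step: corrupt "patch embeddings" with noise and ask
# the transformer to predict that noise (training minimizes this gap).
model = TransformerDenoiser()
x = torch.randn(2, 16, 64)      # pretend patch embeddings
t = torch.rand(2, 1)            # random noise level per sample
noise = torch.randn_like(x)
noisy = x + t.unsqueeze(-1) * noise
loss = ((model(noisy, t) - noise) ** 2).mean()
loss.backward()
```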
While Karpathy admires the transformer’s ubiquity, he doubts it is the ultimate neural network architecture. Its dominance is largely due to its compatibility with GPUs, not its intrinsic superiority. Future architectures will likely evolve alongside new hardware.
Karpathy favors openness and is broadly concerned that large AI companies could hamper innovation. But he sees a vibrant ecosystem of research and entrepreneurship at present, one that would only be improved if truly open-source models were available.
Karpathy was cautious about overhyping “open source” models like LLaMA, which he argued are not truly open source but merely “open weights,” since the data and code needed to train them from scratch are not released.
Karpathy’s advice to AI founders:
Karpathy emphasized the need to make AI more accessible:
Reflecting on his time at Tesla, Karpathy shared insights into Musk’s management style:
Karpathy’s talk paints a picture of an AI ecosystem poised between exciting possibilities and significant challenges. From addressing inefficiencies to fostering accessibility, and from balancing corporate consolidation with startup innovation, his insights chart a course for that ecosystem to thrive.
For more, check out the video and full transcript.