Ilya Sutskever, former Chief Scientist and co-founder of OpenAI, was not the only major exit from the company yesterday. He was soon followed by Jan Leike, co-lead of OpenAI’s "superalignment" team, who announced his resignation on X with the message, “I resigned.”
Leike joined OpenAI in early 2021, expressing his enthusiasm for the company's work on reward modeling, particularly in aligning GPT-3 with human preferences. He later shared his optimism about OpenAI's alignment strategy on his Substack, “Aligned,” in December 2022. Before OpenAI, Leike worked at Google’s DeepMind AI lab.
The departure of these two leaders sparked discussion on X about what it means for OpenAI's ambitions to control advanced AI systems, including the company's overarching goal of achieving artificial general intelligence (AGI), which OpenAI defines as AI that outperforms humans at most economically valuable work.
What Is Superalignment?
Large language models (LLMs), like OpenAI's GPT-4o and competitors such as Google's Gemini and Meta's Llama, operate in complex ways. To ensure these models behave reliably and do not produce harmful or nonsensical responses, developers must "align" them with desired behaviors. This typically involves reinforcement learning from human feedback (RLHF), in which a reward model trained on human preference data guides updates to the LLM, often via the proximal policy optimization (PPO) algorithm.
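To make that concrete, here is a minimal sketch of the PPO clipped objective at the heart of many RLHF pipelines. The tensors below are toy placeholders standing in for real policy log-probabilities and reward-model-derived advantages; OpenAI's actual training stack is not public, so this illustrates the general technique rather than any specific implementation.

```python
# A minimal, self-contained sketch of the PPO clipped objective used in
# RLHF-style fine-tuning. All tensors here are toy placeholders, not real
# model outputs.
import torch

def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Clipped surrogate objective from Schulman et al. (2017).

    new_logprobs: log-probs of sampled tokens under the current policy
    old_logprobs: log-probs under the policy that generated the samples
    advantages:   e.g. reward-model scores minus a baseline
    """
    ratio = torch.exp(new_logprobs - old_logprobs)  # importance ratio
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Take the pessimistic (element-wise minimum) bound, negate for descent.
    return -torch.min(unclipped, clipped).mean()

# Toy usage: pretend a reward model scored three sampled responses.
old_lp = torch.tensor([-1.2, -0.8, -2.0])
new_lp = torch.tensor([-1.0, -0.9, -1.5], requires_grad=True)
adv = torch.tensor([0.5, -0.3, 1.1])  # hypothetical advantage estimates
loss = ppo_clip_loss(new_lp, old_lp, adv)
loss.backward()
print(f"PPO loss: {loss.item():.4f}")
```

The clipping keeps each update close to the policy that generated the samples, which is part of what makes PPO stable enough to fine-tune large models on noisy human-preference rewards.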
Superalignment is an intensified version of this effort, aimed at future AI models far more capable than anything available today, up to and including superintelligence. OpenAI announced the creation of the superalignment team in July 2023, emphasizing the urgency of managing the risks associated with developing and governing such systems.
The core challenge is ensuring that superintelligent AI systems adhere to human intentions. Current alignment techniques depend on human oversight and may not scale to systems that surpass human intelligence. OpenAI acknowledged this problem and committed 20% of its computing resources, a costly allotment of Nvidia GPUs and other hardware, to the superalignment effort.
What’s Next for Superalignment?
With both Sutskever and Leike gone, critical questions arise about the future of the superalignment initiative. Will OpenAI continue to devote the promised 20% of its computing power to the project, or will it pivot in a new direction? Observers note that Sutskever was perceived as a "doomer," particularly concerned with the existential risks posed by AI, in contrast to CEO Sam Altman and others at OpenAI who appear less focused on those threats.
We have reached out to OpenAI for clarification on the future of the superalignment team and will provide updates once we receive a response.