Here’s How OpenAI Plans to Assess the Power and Capabilities of Its AI Systems

OpenAI has developed an internal scale to track the progress of its large language models toward artificial general intelligence (AGI), meaning AI with human-like intelligence, according to a spokesperson. Today's chatbots, such as ChatGPT, sit at Level 1. OpenAI says it is approaching Level 2, defined as a system that can solve basic problems at the level of a person with a PhD. Level 3 covers AI agents that can carry out tasks on a user's behalf, Level 4 describes AI that can generate new innovations, and Level 5, the final step toward AGI, describes AI that can perform the work of entire organizations. OpenAI has previously defined AGI as "a highly autonomous system surpassing humans in most economically valuable work."

OpenAI's structure is built around its mission to achieve AGI, so how the company defines AGI matters a great deal. The company has said that if a safety-conscious project comes close to building AGI before it does, it will stop competing with that project and start assisting it instead. The phrasing of this commitment in OpenAI's charter is somewhat vague, and a structured grading scale could provide clearer benchmarks for determining when AGI has actually been reached.

Despite steady advances, AGI remains a distant goal, and reaching it will require vast investments in computing power. Expert timelines, including those from within OpenAI, vary considerably. In October 2023, OpenAI CEO Sam Altman estimated a timeline of "five years, give or take" for reaching AGI.

This new internal grading scale was introduced shortly after OpenAI announced a collaboration with Los Alamos National Laboratory aimed at exploring how advanced AI models like GPT-4o can safely assist in bioscientific research. The partnership seeks to evaluate GPT-4o's capabilities and establish a framework for safety assessments relevant to the U.S. government. This framework may later serve to evaluate other public or private models as well.

In May, OpenAI disbanded its safety team following the departure of its leader, co-founder Ilya Sutskever. Jan Leike, another prominent OpenAI researcher, resigned shortly afterward, citing concerns that the company's safety culture had taken a back seat to product development. While OpenAI has denied these claims, there are growing apprehensions about what it would mean to reach AGI without robust safety measures in place.

Details of how OpenAI assigns models to levels on this internal scale have not been disclosed. However, at an all-hands meeting, company leaders demonstrated a research project built on the GPT-4 model that they said showed new capabilities approaching human-like reasoning. The grading scale could offer a clearer framework for evaluating progress than subjective interpretation alone. For example, OpenAI CTO Mira Murati said in June that the models in the company's labs are comparable to those already available to the public, whereas CEO Sam Altman said late last year that the company had significantly advanced its models.
