OpenAI is making a fresh effort to live up to the promise of "openness" in its name.
While the company hasn’t made its newest models open source, it is actively addressing AI’s impact on society, including challenges such as disinformation and deepfakes. This week, OpenAI introduced "Model Spec," a framework document intended to guide the behavior of the AI models available through its application programming interface (API) and ChatGPT. OpenAI is seeking public feedback on the document through a web form open until May 22.
As OpenAI co-founder and CEO Sam Altman noted on X, “We will listen, debate, and adapt this over time, but I think it will be very useful to clarify when something is a bug versus a decision.”
Why Release a Model Spec?
The launch of Model Spec aligns with OpenAI's mission to ensure that AI technologies serve users safely and beneficially. However, achieving this goal is complex and often intersects with long-standing philosophical debates about technology and society.
OpenAI highlighted in its blog post: “Even if a model is intended to be broadly beneficial, practical applications may conflict. For instance, a security company might use synthetic data to develop anti-phishing tools, but the same capability could be exploited by scammers.”
By sharing this initial draft, OpenAI invites the public to engage in discussions about the ethical and practical facets of AI development. Users have two weeks to submit their insights via OpenAI’s feedback form.
Following this period, OpenAI plans to publish updates on modifications to the Model Spec, responses to user feedback, and progress on shaping model behavior over the coming year.
Although OpenAI hasn't detailed how Model Spec will be applied to its models in practice, or whether its principles will be folded into the "system prompt" used to align model behavior, the document is expected to carry significant weight in shaping how the models respond.
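OpenAI hasn't published such a mechanism, but to make the idea concrete, here is a minimal, hypothetical sketch of how a developer might approximate Model Spec-style guidance today by writing its principles into the system prompt of an API call. The prompt text and model choice below are illustrative assumptions, not anything OpenAI has released:

```python
# Hypothetical sketch (not OpenAI's actual mechanism): a developer approximating
# Model Spec-style guidance by placing its principles in the system prompt of a
# standard Chat Completions request.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative text only; OpenAI has not published a system prompt for Model Spec.
MODEL_SPEC_STYLE_PROMPT = (
    "Follow these principles: assist the user efficiently, comply with "
    "applicable laws, respect intellectual property and privacy, avoid "
    "unsafe content, assume good intent, and ask clarifying questions "
    "when a request is ambiguous."
)

response = client.chat.completions.create(
    model="gpt-4-turbo",  # any chat model would do; the choice is illustrative
    messages=[
        {"role": "system", "content": MODEL_SPEC_STYLE_PROMPT},
        {"role": "user", "content": "Help me draft a phishing-awareness training email."},
    ],
)
print(response.choices[0].message.content)
```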
In some ways, Model Spec resembles the "Constitutional AI" approach of rival Anthropic, a concept that initially differentiated that company but which it has emphasized less in recent messaging.
Framework for AI Behavior
Model Spec consists of three core components: objectives, rules, and default behaviors, which together are meant to guide AI interactions toward both effectiveness and ethical standards (a brief code sketch of this three-part structure follows the list below).
- Objectives: The document outlines broad principles aimed at aiding developers and users. These include facilitating user goals efficiently, considering diverse stakeholder impacts, and enhancing community welfare.
- Rules: Clear rules are established to navigate AI interactions, ensuring compliance with applicable laws, respect for intellectual property, privacy protection, and a prohibition against unsafe content.
- Default Behaviors: The guidelines stress the importance of assuming good intentions, seeking clarification when necessary, and maximizing helpfulness without overreach. This approach aims to balance the diverse needs of users.
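To make the framework's shape concrete, the following purely illustrative sketch models the spec's three-level hierarchy as plain Python data. The class name, fields, and example entries are assumptions paraphrased from the summaries above, not a schema OpenAI has published:

```python
# Purely illustrative: representing Model Spec's three-level structure
# (objectives -> rules -> default behaviors) as plain Python data.
# The class and field names are hypothetical; OpenAI has released no such schema.
from dataclasses import dataclass, field


@dataclass
class ModelSpec:
    objectives: list[str] = field(default_factory=list)         # broad principles
    rules: list[str] = field(default_factory=list)              # hard constraints
    default_behaviors: list[str] = field(default_factory=list)  # overridable defaults


spec = ModelSpec(
    objectives=[
        "Assist developers and end users efficiently",
        "Consider the impact on a broad range of stakeholders",
        "Enhance community welfare",
    ],
    rules=[
        "Comply with applicable laws",
        "Respect intellectual property and privacy",
        "Do not provide unsafe content",
    ],
    default_behaviors=[
        "Assume good intentions",
        "Ask clarifying questions when necessary",
        "Be as helpful as possible without overreach",
    ],
)
```

One design note: rules and default behaviors are kept in separate fields because the spec treats them differently; rules act as hard constraints, while defaults can be overridden by developer or user instructions.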
Some, including AI influencer and Wharton School professor Ethan Mollick, have compared these principles to Isaac Asimov’s fictional "Three Laws of Robotics" from 1942.
However, there have been criticisms regarding the implementation of Model Spec, particularly concerning how it influences AI responses. Tech writer Andrew Curran noted an example where an “AI Assistant” fails to challenge a user’s incorrect claim that the Earth is flat.
Continuous Engagement and Development
OpenAI acknowledges that Model Spec is a living document, reflecting both current practices and a commitment to adapt based on ongoing research and public input. The organization aims to gather diverse perspectives, especially from global stakeholders like policymakers and domain experts.
Feedback will significantly inform the refinement of Model Spec and future AI developments. OpenAI intends to keep the public informed about changes and insights gained from this feedback loop, reaffirming its commitment to responsible AI development.
Where to Go from Here?
By clearly articulating desired AI behavior through Model Spec and soliciting input from the international community, OpenAI aims to steer AI toward a positive societal impact, even amid ongoing legal scrutiny and criticism over the use of artists' work in training data without their consent.