OpenAI and Anthropic have entered into a partnership with the AI Safety Institute, part of the National Institute of Standards and Technology (NIST), to enhance AI model safety through collaborative research, testing, and evaluation.
The agreement gives the AI Safety Institute access to new AI models from both companies before and after their public release. This mirrors the safety evaluation process used by the U.K.'s AI Safety Institute, under which developers provide foundation models for assessment ahead of release.
AI Safety Institute Director Elizabeth Kelly expressed enthusiasm about the collaboration, stating, “With these agreements in place, we look forward to advancing the science of AI safety. These agreements mark a significant milestone as we aim to responsibly guide the future of AI.”
Additionally, the AI Safety Institute will offer feedback to OpenAI and Anthropic on potential safety enhancements for their models, working closely with their counterparts at the U.K. AI Safety Institute.
Collaborating for Responsible AI
Both OpenAI and Anthropic believe their partnership with the AI Safety Institute will significantly influence the development of responsible AI regulations in the U.S. OpenAI’s Chief Strategy Officer, Jason Kwon, emphasized the institute’s crucial role, saying, “We strongly support the U.S. AI Safety Institute’s mission and look forward to collaborating to shape safety best practices and standards for AI models.”
OpenAI’s leadership has previously shown support for regulatory measures in AI development. Sam Altman, OpenAI’s CEO, reaffirmed the company's commitment to providing its models for governmental safety assessments before their public rollout.
Anthropic, which has recruited members of OpenAI’s safety team, reported that it submitted its Claude 3.5 Sonnet model to the U.K. AI Safety Institute for evaluation ahead of its public release. Anthropic co-founder Jack Clark stated, “Our collaboration with the U.S. AI Safety Institute leverages their expertise to rigorously test our models before widespread deployment. This enhances our capability to identify and mitigate risks, promoting responsible AI development.”
Regulatory Landscape
Established through an executive order from the Biden administration, the U.S. AI Safety Institute at NIST aims to encourage AI developers to submit models for safety evaluations before public release. However, the order cannot compel participation, and it imposes no penalties on companies that opt not to take part. NIST has clarified that while submitting models for evaluation remains voluntary, it is an important step toward the safe and trustworthy development of AI technologies.
Meanwhile, the National Telecommunications and Information Administration will begin assessing the implications of open-weight models for the current ecosystem, though it has acknowledged the difficulty of actively monitoring every open model.
While the agreements between the U.S. AI Safety Institute and leading AI developers represent progress toward regulating model safety, concerns remain: “safety” is still loosely defined, and in the absence of clear regulations it is hard to judge what these commitments will require in practice.
Advocates for AI safety view the agreement as a “step in the right direction.” Nicole Gill, executive director and co-founder of Accountable Tech, stressed the importance of accountability: “The more insight regulators gain into the rapid development of AI, the safer the products will be. NIST must ensure that OpenAI and Anthropic uphold their commitments, as both have previously made promises, such as the AI Election Accord, with limited follow-through. Voluntary commitments from AI leaders pave the way for safety, but only if they are honored.”