OpenAI Strengthens Safety Team and Grants Board Veto Power on High-Risk AI Decisions

Home AI News OpenAI Strengthens Safety Team and Grants Board Veto Power on High-Risk AI Decisions

Updated on October 23 2024

OpenAI is enhancing its internal safety protocols to mitigate the risks associated with advanced AI technology. A newly established “Safety Advisory Group” will oversee technical teams and offer recommendations to leadership, including the board, which now holds veto power—a point of contention regarding its actual utilization.

Typically, discussions around policy updates like these go unnoticed, as they consist largely of closed-door meetings with opaque functions that outsiders rarely see. However, given the recent leadership turmoil and the ongoing discourse surrounding AI risks, it’s essential to examine how the leading AI development company is addressing safety protocols.

In a recent document and blog post, OpenAI introduced its revamped “Preparedness Framework,” which appears to have been adjusted following the leadership shake-up that led to the departure of two board members known for their “decelerationist” stance: Ilya Sutskever (who remains in a modified role) and Helen Toner (who has left the company).

The primary goal of this update is to clarify the process for identifying, analyzing, and addressing "catastrophic" risks associated with their developing models. According to OpenAI, catastrophic risks encompass any threat that could result in hundreds of billions of dollars in economic damage or could severely harm or kill individuals—this includes existential risks, such as the potential for AI to operate independently.

OpenAI categorizes its in-production models under a “safety systems” team, which addresses systemic abuses of ChatGPT through API restrictions or fine-tuning. Models still in development fall under the "Preparedness" team, which identifies and quantifies risks prior to release. Additionally, a “superalignment” team focuses on establishing theoretical guidelines for potential “superintelligent” models.

For the first two categories, which are grounded in current technology rather than theoretical constructs, the evaluation process is straightforward. Each model is assessed across four risk categories: cybersecurity, manipulation (including disinformation), autonomy (i.e., self-directed actions), and CBRN risks (chemical, biological, radiological, and nuclear threats).

There are various assumed mitigations in place; for example, a cautious approach is taken when discussing the creation of dangerous materials. If a model still poses a "high" risk after considering these mitigations, it cannot be deployed. Any model classified with "critical" risks will not progress in development.

OpenAI's risk evaluation framework includes clear documentation, rejecting any notion that assessments are left solely to individual engineers or product managers.

For instance, within the cybersecurity category, increasing operators' productivity on significant cyber operations poses a medium risk. Conversely, a high-risk model could autonomously develop proofs of concept for high-value exploits against fortified targets. A critical risk might involve devising and executing comprehensive cyberattack strategies against such targets based merely on a broad objective. Undoubtedly, this type of capability would be concerning.

I’ve reached out to OpenAI for clarification on how these risk categories evolve—specifically if new risks, such as the creation of photorealistic fake videos, fall under manipulation or necessitate their own category—and will update this post with any responses.

Only medium and high risks are deemed tolerable within these frameworks. However, those developing the models may not always be the most qualified to evaluate them. To address this, OpenAI is establishing a “cross-functional Safety Advisory Group” that will review technical reports from engineers, providing recommendations from a broader perspective. This is intended to uncover potential “unknown unknowns,” a challenging endeavor by nature.

Recommendations from this group will be sent simultaneously to both the board and the leadership team, which includes CEO Sam Altman and CTO Mira Murati. While leadership will make the final decision regarding product deployment, the board can override these choices.

This mechanism aims to prevent situations like the recent controversy, where a high-risk product potentially advanced without the board's knowledge or approval. Interestingly, the leadership changes resulted in sidelining critical voices while introducing business-savvy leaders like Bret Taylor and Larry Summers—whose expertise lies outside AI.

If expert recommendations yield a decision from the CEO, will the board genuinely feel empowered to challenge this and press pause? Furthermore, will we be informed of any such actions? Transparency appears largely unaddressed, aside from OpenAI's commitment to seek independent third-party audits.

Should a model be classified with a “critical” risk level, OpenAI has previously publicized its reluctance to release such powerful models, leveraging this as part of its brand. However, can we be assured that similar accountability will be maintained in light of the genuine risks posed? Whether this approach is sound remains unclear, and it has yet to be thoroughly explored.

ImpriMed’s Dog Cancer Treatment: Pioneering AI Technology for Advancements in Human Oncology

Essential Guide to Ethical and Responsible AI Governance: Best Practices and Frameworks for Implementation

Most people like

Tally - AI Agent for Procurement Compliance Automation & Proposal Evaluations

Unlock the power of our innovative AI tool, designed to streamline the analysis of documents and videos with unparalleled efficiency. Enhance your productivity and insights as you seamlessly navigate through vast amounts of content, turning complex data into accessible knowledge effortlessly. Perfect for businesses and researchers alike, this cutting-edge solution revolutionizes the way you interact with information.

AI analysis Writing Assistants

MolyPix.AI

50.5K

Create Stunning, Customizable Designs You’ll Love to Edit

AI design tool Text to Image

PopAi

1.2M

Enhance your productivity effortlessly with PopAi's cutting-edge AI tool! Unlock new levels of efficiency and streamline your tasks like never before.

AI assistant AI Presentation Generator

CraftWriter

247.6K

Unlock your writing potential with CraftWriter! Transform your skills and express your creativity more effectively than ever before. Dive into our resources and elevate your writing journey today!

writing tool Other

Find AI tools in YBX