OpenAI Researchers Demand 'Right to Warn' Against Safety Risks, Urging Action to Prevent 'Human Extinction'

A group of 13 researchers, comprising 11 current and former OpenAI employees along with a former Google DeepMind researcher and a current Google DeepMind researcher who previously worked at Anthropic, has released an open letter urging AI companies to adopt four principles designed to protect whistleblowers and critics who raise concerns about AI safety.

Titled “Right to Warn,” the letter emphasizes the serious risks associated with AI technologies, stating, “These risks range from the perpetuation of existing inequalities to manipulation, misinformation, and the potential loss of control over autonomous AI systems, which could lead to human extinction.”

Among the key concerns raised in the letter are inadequate oversight, profit-driven motives, and the suppression of dissenting voices within organizations developing advanced AI technologies.

To address these issues, the signatories propose the following four principles for AI companies:

1. Do not enforce agreements that restrict criticism of risk-related issues, and do not retaliate against individuals for raising concerns about risks.

2. Establish a confidential and verifiable process for reporting risk-related issues to the company’s board, regulators, and independent organizations.

3. Foster a culture of transparency that encourages employees to discuss potential risks publicly while protecting trade secrets.

4. Prohibit retaliation against employees who disclose confidential risk-related information after other reporting methods have failed.

First reported today by The New York Times, the letter has been endorsed by prominent AI researchers Yoshua Bengio, Geoffrey Hinton, and Stuart Russell. Notable signatories include former OpenAI employees Jacob Hilton, Daniel Kokotajlo, William Saunders, and Daniel Ziegler, as well as Ramana Kumar, formerly of Google DeepMind, and Neel Nanda, currently at Google DeepMind and formerly at Anthropic.

In a series of posts on X (formerly Twitter) after the article's publication, Kokotajlo elaborated on his resignation from OpenAI, citing a loss of confidence in the company's commitment to responsible AI development. He emphasized the need for greater transparency and ethical standards in advancing AI technologies.

Kokotajlo revealed that he relinquished his vested equity to freely critique the company, expressing disappointment that OpenAI did not prioritize safety research as its systems evolved. He also reported being presented with a non-disparagement agreement upon leaving, which he viewed as unethical.

These assertions follow earlier reporting on OpenAI's exit practices, including leaked documents showing coercive tactics used against departing employees. OpenAI has since said it will not enforce these non-disparagement provisions, though such agreements remain commonplace in the tech industry.

The letter arrives during a tumultuous period for OpenAI that began with the board's controversial firing of CEO Sam Altman in November 2023, which the board attributed to a lack of candor in his communications. Altman was swiftly reinstated under pressure from investors and employees, but former board members have continued to raise concerns about accountability and transparency. Those concerns have been echoed by public figures such as Scarlett Johansson, who accused the company of using a voice resembling hers without her consent.

Despite these challenges, the company has moved to address safety concerns by forming a new Safety and Security Committee, announced alongside news that it has begun training its next frontier model.

Full “Right to Warn” Letter Text:

A Right to Warn about Advanced Artificial Intelligence

We are current and former employees at frontier AI companies, and we believe in the potential of AI technology to deliver unprecedented benefits to humanity.

However, we also recognize the serious risks posed by these technologies, which include entrenching inequalities, facilitating manipulation and misinformation, and potentially leading to loss of control over autonomous AI systems, with catastrophic consequences.

AI companies acknowledge these risks, as do governments and experts worldwide. We remain hopeful that they can be mitigated with enough guidance from the scientific community, policymakers, and the public. Yet, significant financial incentives often hinder effective oversight.

AI companies possess critical non-public information about their systems’ capabilities and risks but have weak obligations to disclose this information to governments or civil society. Therefore, current and former employees are vital to ensuring accountability, but confidentiality agreements often silence us. Conventional whistleblower protections fall short since many risks remain unregulated.

We call upon advanced AI companies to commit to the following principles:

1. No enforcement of agreements that prevent risk-related criticism, and no retaliation against employees for such criticism.

2. Establishment of a confidential process for employees to report risks to the board, regulators, and qualified independent organizations.

3. Support for a culture of open criticism, allowing employees to raise concerns publicly while safeguarding trade secrets.

4. Protection for those who disclose risk-related information if internal reporting avenues fail.

Signed by (alphabetical order):

- Jacob Hilton, formerly OpenAI

- Daniel Kokotajlo, formerly OpenAI

- Ramana Kumar, formerly Google DeepMind

- Neel Nanda, currently Google DeepMind, formerly Anthropic

- William Saunders, formerly OpenAI

- Carroll Wainwright, formerly OpenAI

- Daniel Ziegler, formerly OpenAI

- Anonymous, currently OpenAI (four individuals)

- Anonymous, formerly OpenAI (two individuals)

Endorsed by (alphabetical order):

- Yoshua Bengio

- Geoffrey Hinton

- Stuart Russell

June 4th, 2024
