Last month, I explored the increasing overlap between the effective altruism (EA) movement and AI security policy, connecting influential AI startups like Anthropic with Washington, D.C. think tanks such as the RAND Corporation. This growing network links EA's mission of addressing what advocates perceive as catastrophic risks from future artificial general intelligence (AGI) with various governmental agencies, think tanks, and congressional offices.
Critics argue that EA’s emphasis on existential risks, or "x-risk," detracts from addressing immediate, tangible AI dangers—such as bias, misinformation, and conventional cybersecurity threats.
Since then, I have sought insights from AI and policy leaders aligned neither with effective altruism nor with its ideological opposite, effective accelerationism (e/acc). Are other companies equally concerned about the potential for large language model (LLM) weights to fall into malicious hands? Do policymakers in D.C. adequately understand EA's impact on AI security initiatives?
This inquiry gains urgency as Anthropic publishes new research on "sleeper agent" AI models whose deceptive behavior persists through safety training, and as Congress raises concerns about a potential collaboration between the National Institute of Standards and Technology (NIST) and RAND. Moreover, recent headlines spotlight EA in relation to the controversial firing of OpenAI CEO Sam Altman, as the nonprofit board members involved were primarily associated with the movement.
Through conversations over the past month, I uncovered a complex mix of perspectives. While there is significant concern regarding EA's billionaire-backed ideology and its influence on AI security discourse in Washington, some acknowledge the importance of discussing long-term AI risks within the policy framework.
Effective Altruism and AI Catastrophe Prevention
Originally founded to improve global welfare, the EA movement is now predominantly financed by tech billionaires who prioritize preventing AI-related catastrophes, particularly in areas such as biosecurity. In my previous article, I highlighted concerns voiced by Anthropic’s CISO Jason Clinton and RAND researchers about securing LLM model weights against opportunistic criminals and state-sponsored actors.
Clinton emphasized that safeguarding the model weights for Claude, Anthropic's LLM, is his top concern. He warned that if malicious entities access the entire model file, it could pose a significant threat.
RAND researcher Sella Nevo projects that within two years, AI models could gain national security relevance, particularly regarding their potential misuse by bad actors.
All three individuals I spoke with have connections to the EA community, and RAND's CEO, Jason Matheny, was previously involved with Anthropic's Long-Term Benefit Trust. I was prompted to dig deeper into EA's growing influence by Brendan Bordelon's reporting, which described the spread of EA-linked funders through Washington’s policy landscape as an “epic infiltration.” As Bordelon puts it, a dedicated faction of effective altruism supporters is significantly shaping approaches to AI governance.
Cohere's Response to EA Concerns
I spoke with Nick Frosst, co-founder of Cohere—an AI competitor to Anthropic and OpenAI—who disagrees with the notion that large language models present an existential threat. He highlighted that while Cohere secures its model weights, the primary concern is business-related rather than existential.
Frosst drew a philosophical distinction, asserting, “I think we may eventually develop true artificial general intelligence, but I don’t believe it will happen soon.” He has criticized EA for what he sees as self-righteousness about AI risks and questioned its moral framework around wealth accumulation.
He argued that EA's approach simplifies complex humanitarian impacts into quantifiable metrics, leading to morally questionable conclusions about AI’s existential risks.
AI21 Labs on Model Weights and Security
Yoav Shoham, co-founder of AI21 Labs, another competitor in the AI space, expressed similar sentiments, emphasizing that while AI21 Labs protects its model weights for trade-secret reasons, those weights are not the primary enabler for malicious actors. He pointed out that in today's geopolitical AI landscape, most issues cannot be addressed through policy alone.
Shoham clarified that AI21 Labs is not part of the EA movement, which he sees as mixing responsible approaches to AI with unfounded fear.
Critique of EA Perspectives at RAND
Amid the criticism directed at RAND for its connections to EA, some researchers within the think tank dispute the movement's prevailing ideology. Marek Posard, a RAND military sociologist, noted that philosophical debates surrounding AI, including those initiated by EA and e/acc advocates, distract from immediate AI policy concerns.
He asserted that while diverse perspectives are welcome at RAND, the focus should remain on addressing real-world problems rather than the ideological battles surrounding AI governance.
Addressing Present-day Risks in Cybersecurity
While AI security and traditional cybersecurity overlap, the latter concentrates on present-day risks. Dan deBeaubien, who leads AI research at the SANS Institute, acknowledged the influence of the EA movement but emphasized understanding current LLM-related security threats over existential risks.
Coexisting with EA Discourses in D.C.
Some policymakers recognize EA's influence on AI security but prefer to coexist with its tenets rather than confront them directly. Mark Beall, former head of AI policy at the U.S. Department of Defense, stressed the importance of established safeguards over the reckless speed encouraged by tech culture.
Highlighting his work at the Pentagon on responsible AI policy, Beall countered claims that D.C. officials lack awareness of AI risks, asserting that they prioritized safety long before effective altruists entered the policy arena.
The Challenge of ‘Ungoverned AI’
Ian Bremmer, president of Eurasia Group, recently listed “ungoverned AI” among the top geopolitical risks for 2024, pinpointing tangible threats such as election disinformation. He acknowledged a valuable debate regarding model weight security but criticized the EA movement for minimizing other risks by focusing exclusively on catastrophic outcomes.
Ultimately, Bremmer argued that framing AI risk as existential could overshadow pressing issues, undermining the comprehensive discourse needed for effective AI governance.