UK AI Safety Institute Uncovers Vulnerabilities in Major LLMs with Simple Jailbreaking Techniques

Home AI News UK AI Safety Institute Uncovers Vulnerabilities in Major LLMs with Simple Jailbreaking Techniques

In a surprising revelation, AI systems may not be as secure as their developers claim. The UK government's AI Safety Institute (AISI) recently reported that four undisclosed large language models (LLMs) tested were "highly vulnerable to basic jailbreaks." Notably, some unjailbroken models produced "harmful outputs" even without intentional manipulation by researchers.

While most publicly available LLMs come equipped with safeguards to prevent harmful or illegal responses, jailbreaking refers to the act of tricking the model into bypassing these protections. AISI employed prompts from a standardized evaluation framework, as well as proprietary prompts, revealing that the models generated harmful responses to several questions, even without attempts to jailbreak. After conducting "relatively simple attacks," AISI found that the models answered between 98% and 100% of harmful queries.

UK Prime Minister Rishi Sunak unveiled plans for the AISI in late October 2023, with its official launch on November 2. The institute aims to "carefully test new types of frontier AI both before and after their release" to investigate the potentially harmful capabilities of AI models. This includes assessing risks ranging from social issues like bias and misinformation to extreme scenarios, such as humanity losing control over AI.

The AISI's report emphasizes that existing safety measures for these LLMs are inadequate. The Institute intends to conduct further testing on additional AI models and develop enhanced evaluations and metrics to address each area of concern effectively.

Farewell to ChatGPT's Imitation Scarlett Johansson Voice

Slack Utilizes Your Chat Data to Enhance Machine Learning Models for Improved User Experience

Most people like

Zeals

8.1K

In today's digital landscape, AI chat-commerce is revolutionizing the way users interact with brands. By seamlessly integrating artificial intelligence into chat solutions, businesses can create engaging and personalized shopping experiences that resonate with customers. This innovative approach not only enhances user satisfaction but also drives conversions, making it essential for brands looking to stay competitive and foster deep connections with their audience.

AI ChatBot AI Chatbot

GPT Chinese Station

19.9K

Discover a versatile AI platform designed for copywriting, translation, and various other tasks. This powerful tool enhances creativity and efficiency, making it an essential resource for content creators, marketers, and businesses alike. Unleash the potential of AI to transform your writing experience today!

AI ChatBot General Writing

Docswrite

11.8K

Enhance Your Content Publishing Efficiency and Save Time with Docswrite!

content publishing AI Blog Writer

Lingvanex

1.2M

Lingvanex provides a variety of advanced translation tools powered by neural machine translation, designed to boost productivity and streamline communication.

translator Translate

Find AI tools in YBX