Lakera Launches New Solution to Safeguard Large Language Models Against Malicious Prompts

Large language models (LLMs) are at the forefront of the rapidly growing generative AI movement, enabling the interpretation and creation of human-like text from simple prompts. These prompts can range from summarizing documents to crafting poetry or answering questions using diverse data sources.

However, malicious actors can exploit these prompts through techniques like “prompt injection.” This involves carefully crafted text prompts aimed at tricking an LLM-powered chatbot into granting unauthorized system access or bypassing security protocols.

In response to these challenges, Swiss startup Lakera is officially launching today, promising to protect enterprises from LLM vulnerabilities, such as prompt injections and data leaks. The company has also disclosed an impressive $10 million funding round secured earlier this year.

Data Wizardry

Lakera has developed a unique database that draws insights from various sources, including publicly accessible open-source datasets, proprietary research, and intriguing data from an interactive game launched earlier this year called Gandalf.

In Gandalf, users are encouraged to “hack” the underlying LLM through linguistic tricks to uncover secret passwords. Successful players advance to new levels, with the game becoming increasingly sophisticated in defending against such attacks.

Gandalf is powered by OpenAI’s GPT-3.5, as well as LLMs from Cohere and Anthropic. Although it seems like a playful endeavor to expose LLM weaknesses, insights gathered from this game will inform Lakera’s core product, Lakera Guard, which integrates into company applications via API.

“Gandalf attracts players of all ages, from children to my grandmother,” said Lakera CEO and co-founder David Haber. “Interestingly, a significant portion of our players comes from the cybersecurity community.”

Prompt Injection Taxonomy

Haber revealed that the company has recorded around 30 million interactions from over 1 million users in the last six months, leading to the creation of a “prompt injection taxonomy” that categorizes attacks into ten distinct types. These include direct attacks, jailbreaks, sidestepping attacks, multi-prompt attacks, role-playing, model duping, token smuggling, multi-language attacks, and accidental context leakage. This framework allows Lakera’s clients to compare their inputs against these attack categories on a large scale.

“Our aim is to transform prompt injections into statistical structures,” Haber explained.

Broader Cybersecurity Focus

While prompt injections are a primary focus, Lakera is also dedicated to mitigating other cybersecurity risks. This includes preventing private or sensitive data from being unintentionally exposed and moderating content to ensure LLMs do not produce inappropriate material for children.

“One of the most requested features revolves around detecting toxic language,” Haber noted. “We’re collaborating with a major company that offers generative AI solutions for kids to guarantee safe content delivery.”

Additionally, Lakera addresses misinformation and factual inaccuracies associated with LLMs. According to Haber, Lakera helps to manage “hallucinations”—instances when an LLM contradicts its system instructions or outputs factually incorrect information.

“Customers provide us with context for the LLM's interactions, and we ensure the model remains within those defined parameters,” Haber said.

In essence, Lakera offers a comprehensive solution that spans security, safety, and data privacy in the realm of generative AI.

Timely Launch with EU AI Act

With significant AI regulations on the horizon, notably the EU AI Act, Lakera couldn’t have chosen a better time to launch. Article 28b of this legislation emphasizes the need for generative AI models to undergo rigorous risk assessments and implement necessary safeguards.

Haber and his co-founders have played advisory roles in shaping this act, helping establish the technical groundwork for its anticipated rollout within the next year or two.

“There remains uncertainty around how to effectively regulate generative AI models,” Haber acknowledged. “Technological advancements often outpace regulatory measures, posing a challenge. We aim to provide insights from a developer’s perspective to inform policymaking, ensuring the regulations align with real-world implications for those deploying these models.”

Navigating Security Challenges

Despite the rapid rise of technologies like ChatGPT, enterprises are often wary about adopting generative AI due to security concerns.

“We engage with some of the most innovative startups and leading enterprises; many already have generative AI applications in production or are on the verge of launching them,” Haber shared. “We work closely with them to ensure smooth rollout without security issues. For many companies, security remains a significant barrier to implementing generative AI.”

Founded in Zurich in 2021, Lakera boasts notable paying clients, although it cannot disclose names due to security protocols. Nonetheless, the company has confirmed that LLM developer Cohere—a firm recently valued at $2 billion—is among its clients, along with a “leading enterprise cloud platform” and “one of the world’s largest cloud storage services.”

With $10 million in funding, Lakera is well-positioned to enhance its platform following its official launch.

“We aim to be a trusted partner as organizations integrate generative AI into their operations, ensuring their systems are secure and risks are minimized,” said Haber. “We will continue to evolve our offerings based on the emerging threat landscape.”

Lakera's investment round was led by Swiss VC Redalpine, with additional contributions from Fly Ventures, Inovia Capital, and several angel investors.

Most people like

Find AI tools in YBX