Patronus AI Unveils LLM Evaluation Tool Tailored for Regulated Industries

Last March, Rebecca Qian and Anand Kannappan, both of whom have extensive experience in responsible AI research at Meta, founded Patronus AI. Their goal: a solution, tailored for regulated industries that demand precision and reliability, for evaluating and testing large language models (LLMs).

Rebecca Qian, the company's CTO, previously led responsible natural language processing (NLP) research at Meta AI, while CEO Anand Kannappan contributed to developing explainable machine learning frameworks at Meta Reality Labs. Today marks a significant milestone for their startup as they emerge from stealth mode, launch their product to the public, and announce a successful $3 million seed funding round.

Patronus AI offers a security and analysis framework, delivered as a managed service, for testing LLMs. Its focus is on identifying potential risks, particularly hallucinations, where a model fabricates an answer because it lacks the data to respond correctly.

“Our product is designed to automate and scale the entire model evaluation process, alerting users to any identified issues,” Qian stated. She elaborated on their three-step approach: “First, we provide scoring, where we assess models in real-world situations, such as finance, examining key factors like hallucinations.” Next, the platform generates adversarial test cases, stress-testing the models against these scenarios. Finally, it benchmarks various models according to specific criteria to identify the most suitable model for a given task. “We compare different models, helping users pinpoint the best choice for their unique applications. For instance, one model might exhibit a higher hallucination rate compared to another baseline model,” she explained.
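To make the benchmarking step concrete, here is a minimal sketch of how hallucination rates could be compared across models. It is not Patronus AI's product or API: the function names, the model names, and the graded results are all hypothetical, and it assumes each model output has already been labeled as hallucinated or not.

```python
from collections import defaultdict


def hallucination_rates(results):
    """Map each model to the fraction of its outputs flagged as hallucinated.

    `results` is an iterable of (model_name, hallucinated) pairs, where
    `hallucinated` is a bool assigned by some earlier grading step
    (assumed here, not implemented).
    """
    counts = defaultdict(lambda: [0, 0])  # model -> [flagged, total]
    for model, hallucinated in results:
        counts[model][0] += int(hallucinated)
        counts[model][1] += 1
    return {model: flagged / total for model, (flagged, total) in counts.items()}


def rank_models(results):
    """Return model names ordered from lowest to highest hallucination rate."""
    rates = hallucination_rates(results)
    return sorted(rates, key=lambda model: (rates[model], model))


# Hypothetical graded outputs for two models on the same four test cases.
graded = [
    ("model_a", False), ("model_a", True), ("model_a", False), ("model_a", False),
    ("baseline", True), ("baseline", True), ("baseline", False), ("baseline", False),
]

rates = hallucination_rates(graded)
ranking = rank_models(graded)
print(rates)
print("Lowest hallucination rate:", ranking[0])
```

In this made-up data, the candidate model is flagged on one of four cases and the baseline on two of four, so the candidate ranks first. A real evaluation would need far more test cases and a reliable way of labeling hallucinations.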

The company is laser-focused on highly regulated sectors where inaccurate outputs can lead to severe repercussions. “We ensure that the large language models our clients utilize are safe by detecting instances where they produce business-sensitive or inappropriate content,” Kannappan emphasized.

Patronus AI aims to establish itself as a trusted third-party evaluator of models. “While many claim their LLM is the best, we provide an unbiased, independent assessment. That’s where our role comes in—Patronus serves as the credibility checkmark,” he added.

The company currently has six full-time employees and plans to expand its workforce in response to the rapidly evolving market, although it has not specified an exact number. Qian underscored the importance of diversity within the growing team: “We are deeply committed to fostering diversity, starting at the leadership level. As we expand, we will implement programs and initiatives to cultivate and sustain an inclusive work environment.”

Today's $3 million seed funding round was led by Lightspeed Venture Partners, with participation from Factorial Capital and other notable industry investors.
