"Accelerate LLM Evaluation with Braintrust Data: A Faster Solution for Enterprises"

California-based Braintrust Data has raised $5.1 million in seed funding, led by Greylock Partners. Founded by Ankur Goyal, who previously sold his AI venture Impira to Figma, Braintrust is designed to help enterprises efficiently evaluate and enhance AI models before they reach production.

Despite being a young startup, Braintrust has quickly attracted numerous clients, along with investment from well-known industry figures such as Elad Gil, Clem Delangue, and Greg Brockman. The company aims to expand its team and continue delivering solutions that help developers keep pace in the rapidly evolving AI landscape.

Navigating the Challenges of AI in Production

While AI serves as the backbone of modern applications, integrating and maintaining these systems can be challenging. Minor code modifications intended to enhance an application can inadvertently disrupt the entire workflow, forcing backend teams to scramble for solutions. This reactive strategy can adversely affect the customer experience, which is why evaluating AI performance during development is crucial. In an evaluation, teams measure the application against context-specific data and metrics while experimenting with different models and techniques to optimize outcomes.

Streamlining Time and Effort

While traditional evaluation methods are effective, they often consume significant time and resources and delay feature launches, an issue Goyal encountered at Impira. In response, he founded Braintrust Data to speed up evaluations and real-world testing of code changes.

"Our product enables you to instrument your code for evaluations in under an hour," Goyal explained to the media. "You can quickly re-run evaluations after changes and receive instant feedback on your model's performance and debug specific cases before final deployment. This includes logging examples from both staging and production to identify new user edge cases."

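To make the workflow Goyal describes more concrete, here is a minimal sketch of an evaluation loop that is re-run after a code change and flags the specific cases that got worse. It is an illustration only and does not use Braintrust's SDK: the names `Case`, `exact_match`, `run_eval`, `regressions`, `task_v1`, and `task_v2` are all hypothetical, and the two "task" functions are trivial stand-ins for an application that would normally call a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    """One evaluation example: an input and the output we expect."""
    input: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output matches the expectation (ignoring case and padding)."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(task: Callable[[str], str], cases: List[Case],
             scorer: Callable[[str, str], float] = exact_match) -> Dict[str, float]:
    """Run the task on every case and return a score per input."""
    return {case.input: scorer(task(case.input), case.expected) for case in cases}


def regressions(before: Dict[str, float], after: Dict[str, float]) -> List[str]:
    """List the inputs whose score dropped between two runs."""
    return [key for key in before if after.get(key, 0.0) < before[key]]


def mean_score(scores: Dict[str, float]) -> float:
    """Average score across all cases."""
    return sum(scores.values()) / len(scores)


# Hypothetical stand-ins for two versions of the same application.
def task_v1(text: str) -> str:
    return text.upper()


def task_v2(text: str) -> str:
    # A "small" change that silently drops punctuation.
    return text.upper().replace("!", "")


CASES = [Case("hello", "HELLO"), Case("hi!", "HI!")]

baseline = run_eval(task_v1, CASES)
candidate = run_eval(task_v2, CASES)

print("baseline mean:", mean_score(baseline))
print("candidate mean:", mean_score(candidate))
print("regressed cases:", regressions(baseline, candidate))
```

Keeping a per-case score, and not only an average, is what allows a team to debug the individual examples that a change broke before the change ships.
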
Rapid Customer Adoption

Launched in August 2023, Braintrust has already gained hundreds of enterprise and startup customers, including Airtable, Zapier, Coda, and Instacart. Clients have reportedly increased the accuracy of their AI offerings by over 30% in just weeks, leading to faster release cycles and enhanced team collaboration.

"Our product can operate within your own cloud environment, ensuring enterprise-level security—vital in an AI landscape rife with PII and proprietary information. This capability allows our customers to use Braintrust for critical workloads," Goyal added.

Enhancing AI Team Efficiency

In addition to evaluation tools, Braintrust provides features to help AI teams iterate more quickly: a prompt playground for comparing prompts and benchmarking input-output pairs, dataset management, and an AI proxy that offers access to popular models, including those from OpenAI, Anthropic, and Mistral, as well as Llama 2.

A Growing Focus on AI Quality

As businesses increasingly adopt AI solutions, there is a strong demand for tools that evaluate model performance and address gaps. Braintrust is not the only player in this space; many companies have emerged since the launch of ChatGPT, offering various products to measure model performance and improve observability.

Goyal emphasizes Braintrust's unique approach: "While many products focus on observability, which only provides insights after deployment, our evaluations allow engineering teams to innovate at speeds up to ten times faster than those relying only on post-release fixes."

With the Greylock-led round bringing total capital raised to $8.3 million, Goyal plans to expand the team and advance the product roadmap, enhancing Braintrust's evaluation and AI tooling capabilities, including prompt playground functions, production logging, and multi-modal model support.
