"Galileo's Luna Revolutionizes GenAI Evaluation: 97% Cost Reduction and 11x Speed Improvement"

Galileo Disrupts AI Evaluation with Luna: A Game-Changer for Enterprises

Galileo, a leader in enterprise generative AI, has introduced Galileo Luna, a suite of Evaluation Foundation Models (EFMs) designed to change how businesses evaluate their generative AI systems. Luna targets three challenges that have hindered the adoption of generative AI in production settings: speed, cost, and accuracy.

“Galileo created Luna to overcome the shortcomings of existing GenAI evaluation methods, which are often slow, costly, and imprecise,” said Vikram Chatterji, Co-Founder and CEO of Galileo. “We recognized the need for ultra-low-latency, cost-effective, and high-accuracy evaluations in production environments.”

A Significant Milestone in AI Evaluation

The launch of Luna is a pivotal step for Galileo, which has been pioneering enterprise GenAI since early 2021. The company’s commitment to advancing AI evaluation is evident from nearly a year of rigorous R&D culminating in Luna's development.

In a benchmark test, Luna achieved an AUROC score of 0.78, ahead of the evaluation approaches it was compared against: GPT-3.5, Trulens Groundedness, and RAGAS Faithfulness.
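For readers unfamiliar with the metric, AUROC can be read as the probability that an evaluator gives a higher score to a randomly chosen problematic response than to a randomly chosen acceptable one, so 0.5 is chance and 1.0 is perfect. The Python sketch below computes it by pairwise comparison. The labels and scores are made-up illustration data, not Galileo's benchmark, and the function name `auroc` is hypothetical.

```python
def auroc(labels, scores):
    """AUROC by pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive example gets the higher score.
    Ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustration data only: 1 = hallucinated response, 0 = grounded response.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical evaluator scores
print(auroc(labels, scores))  # 0.75
```

In this toy example three of the four positive/negative pairs are ranked correctly, which gives 0.75.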

Purpose-Built Models Redefining Evaluation Standards

At the core of Luna’s innovation are its purpose-built small language models, specifically designed for evaluation tasks like hallucination detection, context quality assessment, data leakage prevention, and malicious prompt identification. This specialized focus enables Luna to excel across three key metrics: speed, cost, and accuracy.

“By utilizing tailored small language models, Luna achieves evaluations that are 97% cheaper and 11x faster than those conducted with GPT-3.5,” Chatterji explained. Additionally, Luna outperforms previous methods by up to 20% in detecting issues such as hallucinations and personally identifiable information (PII).

In a cost analysis based on evaluating 1 million queries per month, Luna costs $175, compared with $6,248 for GPT-3.5, $7,994 for RAGAS Faithfulness, and $16,641 for Trulens Groundedness.
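The savings implied by these monthly figures can be checked with simple arithmetic. The short Python sketch below uses only the dollar amounts quoted above; the helper name `cost_reduction_pct` is hypothetical.

```python
# Monthly cost in USD for evaluating 1 million queries, as quoted in the article.
MONTHLY_COST_USD = {
    "Luna": 175,
    "GPT-3.5": 6248,
    "RAGAS Faithfulness": 7994,
    "Trulens Groundedness": 16641,
}
QUERIES_PER_MONTH = 1_000_000


def cost_reduction_pct(baseline, candidate):
    """Percentage by which `candidate` is cheaper than `baseline`."""
    return (1 - candidate / baseline) * 100


luna = MONTHLY_COST_USD["Luna"]
for name, cost in MONTHLY_COST_USD.items():
    if name == "Luna":
        continue
    print(f"Luna vs {name}: {cost_reduction_pct(cost, luna):.1f}% cheaper")

print(f"Luna cost per query: ${luna / QUERIES_PER_MONTH:.6f}")
```

Against GPT-3.5 the reduction works out to about 97.2%, which matches the 97% figure in the headline. The reductions against RAGAS Faithfulness and Trulens Groundedness are about 97.8% and 98.9%.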

Innovation Without Traditional Datasets

A standout feature of Luna is its capability to function without traditional ground truth datasets. By utilizing pre-trained evaluation models refined on diverse domain-specific datasets, Luna simplifies the evaluation process, removing the need for time-consuming custom test sets.

Luna’s potential applications are vast. Chatterji noted its effectiveness in industries requiring high reliability, such as healthcare, finance, and telecommunications. “Luna is especially powerful for large-scale enterprise applications that process millions of queries monthly,” he added.

Unrivaled Speed and Continuous Improvement

Galileo's Luna provides exceptional speed, processing a single query in just 0.232 seconds, a significant improvement over competitors like GPT-3.5 (2.5 seconds) and RAGAS Faithfulness (5.4 seconds).
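The speedup factors follow directly from these per-query latencies. A minimal Python check, using only the numbers quoted above (the helper name `speedup` is hypothetical):

```python
# Per-query latency in seconds, as quoted in the article.
LATENCY_SECONDS = {
    "Luna": 0.232,
    "GPT-3.5": 2.5,
    "RAGAS Faithfulness": 5.4,
}


def speedup(baseline_s, candidate_s):
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


luna_s = LATENCY_SECONDS["Luna"]
for name, seconds in LATENCY_SECONDS.items():
    if name == "Luna":
        continue
    print(f"Luna vs {name}: {speedup(seconds, luna_s):.1f}x faster")
```

This gives roughly 10.8x against GPT-3.5, which rounds to the 11x figure in the headline, and roughly 23.3x against RAGAS Faithfulness.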

Use cases for Luna range from real-time AI output monitoring to ensuring chatbot interaction safety. With Galileo’s Fine Tune product, Luna can be customized to meet specific client needs, achieving accuracy levels of 95% or higher in critical sectors, including pharmaceuticals and financial services.

As the generative AI landscape evolves, Galileo remains dedicated to ongoing innovation with Luna, focusing on expanding task support, enhancing accuracy, and further reducing costs and latency.

“Galileo is committed to advancing AI evaluation, helping organizations deploy trustworthy AI solutions,” Chatterji stated. “As generative AI continues to evolve, we will provide clients with cutting-edge capabilities that inspire confidence among users.”

With the launch of Luna, Galileo has strengthened its position as a forerunner in enterprise generative AI evaluation. As companies seek to harness generative AI's potential, Luna’s fast, cost-effective, and accurate evaluations will be instrumental in driving widespread adoption of this transformative technology.
