A year ago, Databricks acquired MosaicML for $1.3 billion. Now operating under the name Mosaic AI, the platform has become central to Databricks’ AI offerings. At the company’s Data + AI Summit, several new Mosaic AI features are being unveiled. Ahead of these announcements, I had the opportunity to speak with Databricks co-founders, CEO Ali Ghodsi and CTO Matei Zaharia.
During this conference, Databricks is introducing five innovative tools for Mosaic AI: the Mosaic AI Agent Framework, Mosaic AI Agent Evaluation, Mosaic AI Tools Catalog, Mosaic AI Model Training, and the Mosaic AI Gateway.
“It’s been an incredible year, with significant advancements in Generative AI,” Ghodsi remarked. “However, the core concerns remain the same: first, how do we enhance the quality and reliability of these models? Second, how do we ensure cost efficiency? There’s a vast disparity in costs between models—sometimes orders of magnitude apart. Lastly, we must ensure that our data privacy is maintained throughout the process.” The newly announced features aim to address these primary concerns of Databricks’ customers.
Zaharia elaborated that enterprises deploying large language models (LLMs) in production often implement multi-component systems, requiring multiple calls to one or more models and utilizing various external tools for database access or retrieval-augmented generation (RAG). These composite systems not only enhance the performance of LLM-based applications but also save costs by leveraging cheaper models for specific tasks or caching outcomes. Most importantly, they make results more reliable and relevant by integrating proprietary data with foundational models.
“The future of impactful, mission-critical AI applications lies in modular systems,” Zaharia explained. “When dealing with critical tasks, engineers need control over all aspects, which is facilitated through a modular approach. Hence, we're engaged in fundamental research to discover the optimal way to create these systems for specific tasks, enabling developers to connect components seamlessly and monitor the entire process.”
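The composite systems Zaharia describes are easier to picture in code. The sketch below is illustrative only: it shows a retrieval step, a cheap model tried first, result caching, and escalation to a more expensive model. The model names and helper functions are hypothetical stand-ins, not Databricks APIs.

```python
# Illustrative sketch of a composite (modular) LLM system: retrieval, a cheap
# model first, caching, and escalation to a larger model only when needed.
# All model names and helper functions are hypothetical placeholders.
from functools import lru_cache

def retrieve_context(query: str) -> str:
    # Stand-in for a vector-search / RAG lookup against proprietary data.
    return "relevant documents for: " + query

def call_model(model: str, prompt: str) -> tuple[str, float]:
    # Stand-in for an LLM call; returns (answer, confidence score).
    return f"[{model}] answer to: {prompt[:40]}...", 0.62

@lru_cache(maxsize=1024)
def answer(query: str) -> str:
    context = retrieve_context(query)
    prompt = f"Context:\n{context}\n\nQuestion: {query}"

    # Try the cheaper model first to control cost.
    reply, confidence = call_model("small-cheap-model", prompt)
    if confidence >= 0.8:
        return reply

    # Escalate only low-confidence queries to the expensive model.
    reply, _ = call_model("large-expensive-model", prompt)
    return reply

print(answer("How do composite LLM systems reduce cost?"))
```

Each piece (retriever, router, cache, models) can be swapped out or monitored independently, which is the control over "all aspects" that Zaharia is pointing to.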
To help developers build these systems, Databricks is launching two key services this week: the Mosaic AI Agent Framework and the Mosaic AI Tools Catalog. The Agent Framework builds on the company’s recently launched serverless vector search functionality and gives developers the tooling to build their own RAG-based applications on top of it.
Both Ghodsi and Zaharia emphasized that the Databricks vector search takes a hybrid approach, combining classic keyword-based search with embedding-based search. It is also deeply integrated with the Databricks data lake, so the data on both sides stays in sync in real time. The governance features of the wider Databricks platform, and specifically the Databricks Unity Catalog, extend to the vector search functionality as well, helping ensure, for example, that personal information does not leak into the search results.
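For intuition on what "hybrid" means here, the toy sketch below blends a keyword-overlap score with an embedding similarity score. It is purely illustrative and not Databricks’ actual vector search implementation; the hashed bag-of-words "embedding" stands in for a real embedding model.

```python
# Toy sketch of hybrid retrieval: blend a keyword score with an embedding
# (vector) similarity score. Illustrative only.
import math

DOCS = {
    "d1": "databricks unity catalog governs data and ai assets",
    "d2": "retrieval augmented generation combines search with llms",
}

def keyword_score(query: str, doc: str) -> float:
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: hashed bag-of-words vector.
    vec = [0.0] * 16
    for token in text.lower().split():
        vec[hash(token) % 16] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query: str, alpha: float = 0.5) -> list[tuple[str, float]]:
    q_vec = embed(query)
    scored = []
    for doc_id, text in DOCS.items():
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(q_vec, embed(text))
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(hybrid_search("unity catalog governance"))
```

The keyword component keeps exact terms (product names, IDs) findable, while the embedding component catches semantically related passages that share no literal words with the query.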
As for the Unity Catalog, which is gradually being open-sourced, Databricks is extending this system to enable enterprises to manage which AI tools and functions their LLMs can utilize when generating responses. The catalog aims to enhance discoverability of these services across organizations.
Ghodsi pointed out that developers can now use these tools to build their own agents by chaining together models and functions with frameworks like LangChain or LlamaIndex. Notably, Zaharia mentioned that many Databricks customers have already started using these capabilities.
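At its core, such an agent is a loop in which a model decides whether to call a registered tool or answer directly. The sketch below shows that loop in a self-contained way; in practice the wiring would go through LangChain or LlamaIndex, and both `fake_llm` and the tool function here are hypothetical stand-ins.

```python
# Minimal sketch of an agent loop that lets an LLM call registered tools.
# `fake_llm` and `lookup_order` are hypothetical stand-ins for a real model
# and a real enterprise function registered in a tools catalog.
import json

def lookup_order(order_id: str) -> str:
    return json.dumps({"order_id": order_id, "status": "shipped"})

TOOLS = {"lookup_order": lookup_order}

def fake_llm(messages: list[dict]) -> dict:
    # Stand-in for a model that decides whether to call a tool or answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "arguments": {"order_id": "A-123"}}
    return {"answer": "Your order A-123 has shipped."}

def run_agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(5):  # cap the number of reasoning/tool steps
        step = fake_llm(messages)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["arguments"])
        messages.append({"role": "tool", "content": result})
    return "No answer within step limit."

print(run_agent("Where is order A-123?"))
```

A catalog of governed tools slots into the `TOOLS` registry role here: it determines which functions the model is allowed to invoke and makes them discoverable across the organization.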
“There’s a substantial number of companies employing these solutions, including agent-like workflows,” Zaharia noted. “People are often surprised by how prevalent it is, but it appears to be the prevailing direction. Moreover, in our internal AI applications, such as the assistant features on our platform, we’ve observed that this is the ideal way to develop them.”
To evaluate these new applications, Databricks is also introducing Mosaic AI Agent Evaluation, an AI-assisted evaluation tool that uses LLM judges to test how an application performs in production. It also lets enterprises quickly gather feedback from users and even have them help label an initial dataset. Agent Evaluation includes a UI component rooted in Databricks’ acquisition of Lilac earlier this year, which gives users a way to visualize and search large text datasets.
“Every customer expresses a need for internal labeling. They want their employees to label maybe 100 to 500 data points, which we can then use to refine the LLM judges,” explained Ghodsi.
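The LLM-as-judge pattern referenced above is roughly the following: a judge prompt scores an agent’s answer against a rubric and a small set of human-labeled references. This is a hedged sketch, with `call_judge_llm` as a hypothetical stand-in for a real judge model rather than the Agent Evaluation API.

```python
# Sketch of LLM-as-judge evaluation. `call_judge_llm` is a hypothetical
# stand-in; a real judge model would return a graded score for each answer.
def call_judge_llm(prompt: str) -> str:
    # Stand-in for a real judge model; returns a score from 1-5 as text.
    return "4"

JUDGE_TEMPLATE = """You are grading an AI assistant's answer.
Question: {question}
Answer: {answer}
Reference (human-labeled) answer: {reference}
Give a score from 1 (wrong) to 5 (correct and well grounded). Reply with the number only."""

def judge(question: str, answer: str, reference: str) -> int:
    prompt = JUDGE_TEMPLATE.format(question=question, answer=answer, reference=reference)
    return int(call_judge_llm(prompt))

# A handful of human-labeled examples (the "100 to 500 data points" Ghodsi
# mentions) serve as references and as a sanity check on the judge itself.
labeled = [
    ("What does Unity Catalog do?",
     "It governs data and AI assets.",
     "Unity Catalog provides governance for data and AI assets."),
]
scores = [judge(q, a, ref) for q, a, ref in labeled]
print(sum(scores) / len(scores))
```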
Databricks is also looking to improve output quality through fine-tuning, with the newly launched Mosaic AI Model Training service. The service lets users fine-tune models with their organization’s proprietary data to improve performance on specific tasks.
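As a rough picture of what feeding proprietary data into fine-tuning tends to look like, the snippet below packages examples as prompt/completion records in JSONL, a common input format for instruction fine-tuning. The records are invented for illustration, and the exact format Mosaic AI Model Training expects may differ.

```python
# Illustrative only: packaging proprietary examples as prompt/completion
# records (JSONL), a typical input format for instruction fine-tuning.
import json

examples = [
    {"prompt": "Summarize ticket #4821 for the support dashboard.",
     "completion": "Customer reports login failures after the 2.3 upgrade; workaround shipped."},
    {"prompt": "Classify this contract clause: 'Either party may terminate with 30 days notice.'",
     "completion": "termination"},
]

with open("train.jsonl", "w") as f:
    for record in examples:
        f.write(json.dumps(record) + "\n")
```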
The last of the new tools is the Mosaic AI Gateway, which the company describes as a “unified interface to query, manage, and deploy any open source or proprietary model.” It lets users query any LLM in a governed way, with a centralized store for credentials. No enterprise, after all, wants its engineers sending sensitive data to third-party services indiscriminately.
In today’s cost-conscious landscape, the AI Gateway allows IT departments to implement rate limits per vendor, helping maintain budget control. Furthermore, it provides usage tracking and tracing functionalities for effective system troubleshooting.
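The value of such a gateway layer is easier to see in a sketch: credentials live in one place, each vendor gets a request-rate budget, and every call is logged for tracing. This is a hypothetical stand-in, not the Mosaic AI Gateway API itself.

```python
# Sketch of what a gateway layer can enforce: centralized credentials,
# per-vendor rate limits, and usage logging. Hypothetical stand-in only.
import time
from collections import defaultdict

CREDENTIALS = {"openai": "sk-central-000", "anthropic": "key-central-000"}  # stored once, centrally
RATE_LIMITS = {"openai": 60, "anthropic": 30}  # max requests per minute per vendor

class Gateway:
    def __init__(self):
        self.request_log = []            # usage tracking / tracing
        self.window = defaultdict(list)  # request timestamps per vendor

    def query(self, vendor: str, prompt: str) -> str:
        now = time.time()
        # Keep only timestamps inside the 60-second window, then check the limit.
        self.window[vendor] = [t for t in self.window[vendor] if now - t < 60]
        if len(self.window[vendor]) >= RATE_LIMITS[vendor]:
            raise RuntimeError(f"Rate limit exceeded for {vendor}")
        self.window[vendor].append(now)

        self.request_log.append({"vendor": vendor, "chars": len(prompt), "ts": now})
        api_key = CREDENTIALS[vendor]    # engineers never handle raw keys directly
        return f"response from {vendor} (key ending {api_key[-3:]})"  # stand-in for the real call

gw = Gateway()
print(gw.query("openai", "Draft a release note for the new gateway."))
```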
As Ghodsi put it, these new features grew out of observations of how Databricks users are currently working with LLMs: “We noticed a significant market shift in the past quarter and a half. Earlier last year, most conversations revolved around support for open source; however, upon closer inspection, everyone was utilizing OpenAI. Despite outward support for open source, many were integrating OpenAI in practice.” Now, users have become far more sophisticated and are adopting open models (though many of these are not genuinely open source), which calls for a new set of tools to navigate the challenges and opportunities that follow.