[Updated: 12/14, 9:10 AM]
Chet Kapoor, CEO of DataStax—a cloud database firm leveraging open-source Apache Cassandra—proclaimed at the AI.Dev conference in Silicon Valley that Cassandra is the “best database for generative AI.”
AI Agents Are Coming
Kapoor made his remarks in front of an audience of 700 during the Linux Foundation’s event, highlighting the competitive race among startups and established companies vying for leadership in the rapidly evolving generative AI space. As enterprise brands explore technology options, the choice of database providers becomes crucial. While large language model (LLM) providers like OpenAI, Anthropic, Google (Gemini), and Meta (Llama) generate much buzz, the battle for database supremacy among companies supporting LLM applications is equally intense.
In his keynote, Kapoor outlined several reasons why DataStax’s Cassandra database stands out. Notably, it is one of the most reliable operational databases favored by enterprises. Many organizations have successfully deployed generative AI at scale using Cassandra, and its technological advantages help it outperform rivals like MongoDB and Pinecone.
DataStax is also contemplating a public offering, and Kapoor appears eager to stir interest. Last June, the company raised $115 million at a $1.6 billion valuation. While DataStax hasn’t disclosed financial details, Kapoor noted that it’s on the radar for banks looking to take companies public in 2024-2025.
Kapoor’s Key Points:
1. Popularity and Reliability of Cassandra
Cassandra remains one of the most widely used operational databases, even as companies like Microsoft and Amazon promote their own cloud services with integrated databases for generative AI. These tech giants have encouraged users to adopt their platforms by eliminating barriers such as complex data migrations.
Kapoor humorously critiqued these cloud providers for over-complicating solutions: “There’s one to go to the bathroom in the morning… and then one for the afternoon, and one for the evening.” He explained that generative AI has prompted enterprise CIOs to seek integrated databases for seamless querying, an area where Cassandra excels. In contrast, Microsoft and Amazon’s databases typically focus on analytical workloads, which can lead to costly inefficiencies in operational tasks related to generative AI.
DataStax prioritizes cost-effectiveness and performance, which appeals to Fortune 500 clients. Notable users of Cassandra include Netflix for movie metadata, FedEx for package tracking, Apple for iCloud and iMessage data, and Home Depot for website operations. As organizations develop new AI applications, their established success with Cassandra encourages ongoing consolidation around this technology.
2. Active Generative AI Deployments
Kapoor highlighted nine companies utilizing DataStax’s Astra DB cloud database for generative AI. While many enterprises are experimenting with generative AI, few have moved to large-scale production, mainly due to concerns over safety and reliability. As competitive pressure mounts, spending is expected to shift toward actual deployments next year.
A few notable customers deploying LLMs include:
- Physics Wallah: An Indian education platform reaching 6 million users with a versatile LLM-driven bot, developed in just 55 days.
- Skypoint: A healthcare service for seniors that employs an LLM for personalized treatment planning, freeing up over 10 hours weekly for doctors.
Others include Hey You, Reel Star, Arre, Hornet, Restworld, Sourcetable, and Concide. Kapoor noted that small- and medium-sized businesses could adapt quickly, while larger enterprises face more regulatory hurdles.
3. Superior Technology Performance
Kapoor emphasized DataStax’s advancements in Astra’s vector search capabilities, a critical component for generative AI databases. Astra’s JVector technology offers 16% higher relevance than leading competitor Pinecone, a meaningful edge, since relevance determines how reliably a query surfaces the right results. An upcoming benchmarking report will provide further insights, but preliminary findings indicate Astra delivers superior transaction processing compared with both Pinecone and MongoDB.
Astra DB uniquely offers zero-latency access to vectorized data, from indexing to querying.
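To make the "relevance" claim concrete: vector search ranks stored embeddings by similarity to a query embedding, most often cosine similarity. The sketch below is a minimal brute-force illustration of that ranking, not DataStax's JVector implementation, which uses approximate nearest-neighbor indexing to get comparable results at scale; the `docs` corpus and function names are invented for the example.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=2):
    # Rank every stored vector against the query; a real vector database
    # replaces this exhaustive scan with an ANN index (e.g. HNSW-style graphs).
    ranked = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy 3-dimensional "embeddings"; production embeddings have hundreds of dimensions.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

print(top_k([1.0, 0.05, 0.0], docs))  # → ['doc_a', 'doc_b']
```

Relevance benchmarks like the one Kapoor cited measure how often the top-ranked results returned by the approximate index match the exact ranking a brute-force scan like this would produce.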
Looking Ahead: Rapid Adoption of Generative AI
Kapoor predicted that generative AI adoption will occur faster than previous technological revolutions, building on existing frameworks like web, mobile, and cloud technologies. He anticipates that transformative revenue-generating use cases will emerge next year, including advanced LLM functionalities that enable AI agents to perform complex tasks. Material revenue from generative AI integrations could manifest as early as Q2 2024, especially in sectors like retail and travel.
While Kapoor and DataStax chief product officer Ed Anuff highlighted Cassandra's strengths, they acknowledged that generative AI will lift the broader database sector. The demands of AI applications necessitate increased storage and computational resources, drawing attention from cloud and database providers. “If AI applications become a big deal, they are going to be the primary growth driver for both private and public database companies for easily the next five years,” Anuff stated.