Qdrant Aims to Make Vector Database RAG More Cost-Effective

More companies are working to integrate retrieval augmented generation (RAG) systems into their technology stacks, and new methods for improving the retrieval step are emerging as a result.

Vector database firm Qdrant believes its newly developed search algorithm, BM42, will significantly improve RAG's efficiency and cost-effectiveness.

Qdrant, founded in 2021, developed BM42 to improve hybrid search, which combines semantic search with keyword search. Andrey Vasnetsov, co-founder and CTO of Qdrant, explained that BM42 is an update to the widely used BM25 algorithm, which ranks documents by their relevance to a search query. Traditional search systems typically use BM25, but RAG relies on vector databases, which represent data as numerical vectors so that similar items can be matched by how close their vectors are.

Vasnetsov stated, “Traditional keyword matching algorithms like BM25 assume documents are sufficiently sized to generate statistics. However, RAG works with smaller information chunks, making BM25 inadequate.”
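To make the point about statistics concrete, here is a minimal sketch of the standard Okapi BM25 formula in Python. The function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the three sample chunks are illustrative assumptions, not anything taken from Qdrant.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, saturated by k1 and normalized by document length.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
            score += idf * norm
        scores.append(score)
    return scores

# Illustrative RAG-sized chunks (made up for this example).
chunks = [
    "qdrant is a vector database".split(),
    "bm25 ranks documents by keyword statistics".split(),
    "vector search uses embeddings".split(),
]
scores = bm25_scores("vector database".split(), chunks)
```

In chunks this short, every matching term appears exactly once, so the term-frequency part of the formula says almost nothing and the score is driven mostly by how rare a term is across the collection.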

BM42 uses a language model to extract the most pertinent information from each document rather than to generate embeddings. The extracted text is tokenized and each token is scored, which lets Qdrant accurately identify the information needed to answer a specific query.
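The article does not give BM42's actual formula. As a rough sketch only, the idea of scoring tokens with model-derived information can be illustrated by assuming the language model yields a per-token importance weight for each chunk, and that this weight stands in for BM25's term-frequency component alongside the usual inverse document frequency. The weights below are hard-coded, hypothetical numbers rather than real model output, and the names `idf` and `importance_score` are made up for this example.

```python
import math

def idf(term, doc_freq, n_docs):
    """BM25-style inverse document frequency for a term."""
    df = doc_freq.get(term, 0)
    return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))

def importance_score(query_terms, token_weights, doc_freq, n_docs):
    """Sum IDF times a model-derived importance weight over matching terms.

    token_weights maps each token in one chunk to a weight in [0, 1].
    In a real system these would come from a language model; here they
    are hypothetical values supplied by hand.
    """
    return sum(
        idf(t, doc_freq, n_docs) * token_weights[t]
        for t in query_terms
        if t in token_weights
    )

# Hypothetical per-token weights for two short chunks.
chunk_a = {"qdrant": 0.6, "is": 0.05, "fast": 0.35}
chunk_b = {"search": 0.5, "is": 0.1, "qdrant": 0.1, "here": 0.3}
doc_freq = {"qdrant": 2, "is": 2, "fast": 1, "search": 1, "here": 1}

query = ["qdrant", "fast"]
score_a = importance_score(query, chunk_a, doc_freq, n_docs=2)
score_b = importance_score(query, chunk_b, doc_freq, n_docs=2)
```

Under these assumptions, a term that appears once in each chunk can still count for more in the chunk where the model considers it central, which plain term counting cannot express.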

Hybrid search presents various options for enhancement.

BM42 is not the only method vying to replace BM25 in hybrid search and RAG applications. Splade, short for Sparse Lexical and Expansion model, is another contender. It uses a pre-trained language model to capture relationships between words and to add related terms, so a query can match relevant documents even when their wording differs.

While some vector database companies utilize Splade, Vasnetsov asserts that BM42 offers a more cost-effective solution. “Splade can be very expensive due to the size and computational demands of these models,” he noted.
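Whichever keyword-side method is used, hybrid search still has to merge a keyword ranking with a semantic ranking. The article does not say how Qdrant combines them; the sketch below uses reciprocal rank fusion (RRF), a common general-purpose technique swapped in here for illustration. The document IDs and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc3", "doc1", "doc7"]
semantic_hits = ["doc1", "doc5", "doc3"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

RRF needs only rank positions, not raw scores, which is convenient when keyword scores and vector similarities are on different scales.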

RAG is rapidly emerging as a focal point in enterprise AI, as organizations seek to leverage generative AI models with their proprietary data. By utilizing RAG, companies can provide employees and users with more accurate and timely information drawn from organizational data.

Major players like Microsoft and Amazon now offer cloud infrastructure designed for building RAG applications. Additionally, OpenAI acquired Rockset in June 2024 to enhance its RAG capabilities.

While RAG allows users to ground AI model outputs in company data, the system still relies on a language model and remains susceptible to inaccuracies, often referred to as "hallucinations."
