AI often produces inaccurate information, a significant concern for everyday users and for businesses in particular, since misleading results can hurt the bottom line. According to a recent Salesforce survey, half of employees worry about the accuracy of the answers generated by their company's AI systems.
While there's no foolproof solution for these “hallucinations,” certain techniques can mitigate the issue. One effective method is retrieval-augmented generation (RAG), which combines an AI model with a knowledge base. This approach provides supplementary information to the AI before it generates a response, acting as an essential fact-checking tool.
The demand for more dependable AI has led to the emergence of companies built on RAG technology, such as Voyage AI. Founded by Stanford professor Tengyu Ma in 2023, Voyage powers RAG systems for a host of companies, including Harvey, Vanta, Replit, and SK Telecom.
“Voyage is committed to improving search and retrieval accuracy in enterprise AI,” Ma shared in an interview. “Our solutions are tailored for specific domains, including coding, finance, legal issues, and multilingual applications, aligning closely with each company’s unique data.”
Voyage builds RAG systems by training AI models to convert content such as text, documents, and PDFs into numerical representations known as vector embeddings. These embeddings compactly capture the meaning of data points and the relationships between them, which makes them particularly useful for applications that depend on accurate search.
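To make the idea concrete, here is a minimal sketch of how search over vector embeddings can work. It is not Voyage's implementation: the three-dimensional vectors and document titles below are invented for illustration, whereas a real system would get much longer vectors from a trained embedding model. The sketch ranks documents by cosine similarity, a common way to measure how close two embeddings are.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-written for illustration only.
# A real embedding model would produce these vectors from the text.
documents = {
    "quarterly revenue report": [0.9, 0.1, 0.0],
    "employee vacation policy": [0.1, 0.8, 0.2],
    "python style guide": [0.0, 0.2, 0.9],
}

def search(query_vector, docs, top_k=1):
    """Rank document titles by similarity to the query vector."""
    ranked = sorted(
        docs,
        key=lambda title: cosine_similarity(query_vector, docs[title]),
        reverse=True,
    )
    return ranked[:top_k]

# Pretend this is the embedding of "How much did we earn last quarter?"
query_vector = [0.8, 0.2, 0.1]
print(search(query_vector, documents))
```

With these made-up numbers, the query vector points in nearly the same direction as the revenue report's vector, so that document ranks first even though the query and the title share no words.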
Voyage employs a specialized form of embedding called contextual embedding, which captures both the semantic meaning of information and the context in which it appears. For instance, the word “bank” would produce different vectors in the sentences “I sat on the bank of the river” and “I deposited money in the bank,” because its meaning depends on the surrounding words.
Voyage offers its models for on-premises, private cloud, and public cloud use, and fine-tunes them for customers who opt for that service. While many companies, including OpenAI, provide customizable embedding solutions, Ma asserts that Voyage achieves superior performance at lower cost.
“In RAG, when processing a question or query, we first retrieve pertinent information from a disorganized knowledge base—similar to how a librarian would search for books in a library,” Ma explained. “Traditional RAG techniques often falter with context loss during encoding, which can impede effective information retrieval. Voyage’s embedding models offer best-in-class retrieval accuracy, enhancing the overall response quality of RAG systems.”
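The retrieve-then-answer flow Ma describes can be sketched in a few lines. This is an illustration under loud assumptions, not Voyage's method: the `embed` function below is a stand-in that simply counts words, where a production system would call a trained embedding model, and the knowledge base, function names, and prompt wording are all hypothetical. The sketch stops at building the prompt; sending it to a language model is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in embedding (an assumption): a bag of lowercase word counts.
    A real RAG system would call a trained embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical knowledge base, invented for illustration.
knowledge_base = [
    "Refunds are issued within 14 days of a return request.",
    "Support is available by email on weekdays.",
    "Annual plans are billed once per year.",
]

def retrieve(question, passages, top_k=1):
    """Step 1: find the passages most similar to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, passages):
    """Step 2: hand the retrieved passages to the model alongside the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

question = "How many days do refunds take?"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)
```

A word-count stand-in only matches passages that share exact words with the question, which is the kind of limitation that trained embedding models are meant to overcome.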
Adding weight to these claims is the endorsement from Anthropic, a competitor of OpenAI, which describes Voyage’s models as “state of the art.” “Voyage’s method utilizes vector embeddings tailored to company data, resulting in context-aware retrievals that considerably boost accuracy,” Ma noted.
Voyage, which is based in Palo Alto, has more than 250 customers, though Ma did not disclose revenue figures. In September, the company closed a $20 million Series A funding round led by CRV, with participation from Wing VC, Conviction, Snowflake, and Databricks. The new capital, which brings Voyage's total funding to $28 million, will support the launch of new embedding models and a planned expansion of the team.