Data is the cornerstone of AI innovation. Organizations, from agile startups to multinational corporations, are investing billions to harness datasets for high-performing AI applications.
However, despite these substantial investments, accessing and utilizing data from diverse sources and modalities—such as text, video, and audio—remains a complex challenge. Teams face numerous integration hurdles, resulting in delays and lost business opportunities.
ApertureData, a California-based startup, aims to address this challenge with its unified data layer, ApertureDB. This solution combines the strengths of graph and vector databases with multimodal data management, enabling AI and data teams to accelerate their application deployment. Recently, ApertureData announced $8.25 million in seed funding and the launch of a cloud-native version of its graph-vector database.
“ApertureDB can reduce data infrastructure and dataset preparation times by 6 to 12 months, providing immense value to CTOs and CDOs who must develop effective AI strategies in a rapidly changing environment with conflicting data requirements,” said Vishakha Gupta, founder and CEO of ApertureData. She added that the offering can make data science and machine learning teams ten times more productive, on average, in multimodal AI development.
What Makes ApertureData Stand Out?
Many organizations struggle to manage the increasing influx of multimodal data (terabytes of text, images, audio, and video), which hinders their ability to leverage AI effectively. The challenge lies not in data scarcity, but in the fragmented tool ecosystem needed to process it for advanced AI applications.
Currently, teams must gather data from various sources, store it in cloud repositories, and deal with continuously changing metadata in files or databases. This process often requires writing custom scripts for data retrieval and preprocessing. Once initial tasks are completed, teams must integrate graph databases and vector search functionalities to implement desired AI experiences, leading to significant delays.
“Enterprises expect their data layer to facilitate the management of diverse data modalities, streamline ML preparation, and support dataset management, annotations, model tracking, and advanced data search and visualization. Unfortunately, they often resort to manually integrated solutions involving various cloud storage systems, databases, and processing libraries, which complicates the workflow and delays project timelines,” explained Gupta, who recognized this issue while working with vision data at Intel.
To address this problem, Gupta partnered with Luis Remis, a fellow research scientist at Intel Labs, to create a single data layer that handles all multimodal AI data tasks in one platform.
ApertureDB now enables enterprises to centralize datasets (large images, videos, documents, embeddings, and their metadata) for efficient retrieval and querying. It provides a unified view of the schema and includes knowledge graph and vector search capabilities for diverse AI applications, from chatbots to search systems.
“Through extensive conversations, we learned the need for a database that comprehensively understands both multimodal data management and AI requirements, making adoption and production deployment effortless. That’s precisely what we have achieved with ApertureDB,” Gupta remarked.
How Does ApertureDB Compare to Existing Solutions?
While many AI-focused databases are available, ApertureData seeks to carve out a niche by offering a unified product that natively handles multimodal data and seamlessly integrates knowledge graphs with rapid multimodal vector search for AI applications. Users can easily explore relationships among datasets and employ preferred AI frameworks for specific applications.
“Our primary competition is in-house data platforms that rely on a mix of tools, such as relational or graph databases, cloud storage, and in-house scripts. Typically, we replace solutions like Postgres, Weaviate, Qdrant, Milvus, Pinecone, MongoDB, or Neo4j, particularly in multimodal and generative AI contexts,” Gupta emphasized.
ApertureData claims its database can boost the productivity of data science and AI teams by an average of 10 times. The company also says it mobilizes multimodal datasets up to 35 times faster than traditional solutions and delivers vector search and classification performance 2 to 4 times faster than existing open-source vector databases.
While Gupta declined to name specific customers, she noted that the company has established deployments with select Fortune 100 companies, including a leading home furnishing retailer and a large manufacturer, as well as with various biotech and emerging generative AI startups.
“Across our deployments, the feedback from customers highlights significant gains in productivity, scalability, and performance,” she added, noting that the company has saved one customer $2 million.
Looking ahead, ApertureData plans to expand its cloud platform to accommodate new classes of AI applications, enhance ecosystem integrations for a seamless user experience, and extend partnerships for broader deployments.