Activeloop Raises $11 Million to Enhance AI Database Solutions
California-based startup Activeloop has secured $11 million in Series A funding from investors including Streamlined Ventures, Y Combinator, and Samsung Next. The company, co-founded by Princeton dropout Davit Buniatyan, specializes in a dedicated database designed to expedite AI project development.
Activeloop stands out in a crowded data platform market by addressing a critical challenge for enterprises: leveraging unstructured multimodal data to train AI models. Its technology, "Deep Lake," enables teams to develop AI applications at costs up to 75% lower than competing solutions while boosting engineering productivity by as much as five times.
Unlocking the Potential of AI with Deep Lake
Businesses increasingly want to harness complex datasets for diverse AI applications. McKinsey research estimates that generative AI could generate between $2.6 trillion and $4.4 trillion in global corporate profits annually, with the impact spanning domains such as customer interactions, marketing content creation, and software code generation from natural language prompts.
How Activeloop Deep Lake Works
Training high-performance foundation AI models often involves managing petabyte-scale unstructured data across text, audio, and video. With traditional methods, teams must sift through disorganized data silos and write extensive boilerplate and integration code, which drives up project costs.
Activeloop addresses this inefficiency by standardizing storage and access in Deep Lake. The system stores complex data, such as images and videos, as machine learning-native mathematical representations (tensors). Teams can then retrieve that data through a SQL-like Tensor Query Language, visualize it in the browser, or feed it to deep learning frameworks like PyTorch and TensorFlow.
With Deep Lake, developers can efficiently filter and search multimodal data, track versions, and stream data for training AI models tailored to specific applications.
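To make the idea of tensor-style storage with filtering concrete, here is a minimal Python sketch. It is not Deep Lake's API: the ToyTensorDataset class, its method names, and the sample data are all hypothetical, and nested lists stand in for real tensors.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToyTensorDataset:
    """Hypothetical columnar store: one named column ("tensor") per field."""

    columns: Dict[str, List[Any]] = field(default_factory=dict)

    def append(self, sample: Dict[str, Any]) -> None:
        # Every sample must carry the same fields so columns stay aligned.
        if self.columns and set(sample) != set(self.columns):
            raise ValueError("sample fields do not match existing columns")
        for name, value in sample.items():
            self.columns.setdefault(name, []).append(value)

    def __len__(self) -> int:
        return len(next(iter(self.columns.values()), []))

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # Reassemble one sample from all columns.
        return {name: col[index] for name, col in self.columns.items()}

    def filter(self, predicate: Callable[[Dict[str, Any]], bool]) -> List[int]:
        # Return the indices of samples that satisfy the predicate.
        return [i for i in range(len(self)) if predicate(self[i])]


# Illustrative data: tiny 2x2 "images" paired with text labels.
dataset = ToyTensorDataset()
dataset.append({"image": [[0, 255], [255, 0]], "label": "cat"})
dataset.append({"image": [[10, 20], [30, 40]], "label": "dog"})
dataset.append({"image": [[5, 5], [5, 5]], "label": "cat"})

cat_indices = dataset.filter(lambda s: s["label"] == "cat")
print(cat_indices)  # [0, 2]
```

Keeping each field in its own column means a filter on labels never has to touch image data, which is one reason columnar layouts suit this kind of query.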
Transforming Data Management in AI
Buniatyan emphasizes that Deep Lake offers the advantages of a vanilla data lake while storing all data in the tensor format that deep learning algorithms require. Tensors are kept in cloud object storage such as AWS S3, or in local storage, and streamed to GPUs for training. This replaces the batch copying methods that previously left GPUs idle.
Activeloop was founded in 2018, growing out of data challenges Buniatyan faced at the Princeton Neuroscience Lab. Since then it has developed comprehensive database functionality with both open-source and proprietary elements. The open-source side includes the dataset format, version control, and various APIs for streamlined data handling. The proprietary features add advanced visualization tools and a robust streaming engine.
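The difference between streaming and batch copying can be sketched in a few lines of Python. This is an illustration of the general prefetching technique, not Activeloop's streaming engine: stream_batches, fetch_chunk, and the in-memory fake_store are hypothetical stand-ins for a loader, a remote read, and a storage bucket.

```python
import queue
import threading
from typing import Any, Callable, Iterable, Iterator

_DONE = object()  # sentinel marking the end of the stream


def stream_batches(
    chunk_ids: Iterable[Any],
    fetch_chunk: Callable[[Any], Any],
    prefetch: int = 2,
) -> Iterator[Any]:
    """Yield chunks in order while a background thread fetches ahead."""
    buffer: queue.Queue = queue.Queue(maxsize=prefetch)

    def producer() -> None:
        try:
            for chunk_id in chunk_ids:
                # Blocks once `prefetch` chunks are waiting, bounding memory.
                buffer.put(fetch_chunk(chunk_id))
        finally:
            # Always signal the end, even if a fetch fails, so the
            # consumer does not wait forever.
            buffer.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is _DONE:
            return
        yield item


# Stand-in for remote storage: chunk id -> small batch of values.
fake_store = {i: [i * 10, i * 10 + 1] for i in range(4)}

for batch in stream_batches(range(4), fake_store.__getitem__):
    # A real training step would run here while the next chunk loads.
    print(batch)
```

Because the next chunk is fetched while the current one is being consumed, the consumer rarely waits on storage. A batch-copy approach would instead download everything first and leave the accelerator idle in the meantime.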
While specific customer numbers remain undisclosed, the open-source project has been downloaded over one million times, bolstering Activeloop's foothold in enterprise markets. The enterprise offering operates on a usage-based pricing model and is already utilized by Fortune 500 companies in regulated sectors like biopharma, life sciences, medtech, automotive, and legal.
For instance, Bayer Radiology has implemented Deep Lake to consolidate various data modalities into a single solution, significantly reducing data pre-processing time while introducing a "chat with X-rays" feature that enables data scientists to query scans using natural language.
Future Plans for Growth
Activeloop aims to enhance its enterprise solutions and attract additional clients to its AI database, focusing on simplifying the organization and retrieval of complex unstructured data. With the new funding, the company plans to expand its engineering team.
Buniatyan also anticipates the upcoming launch of Deep Lake v4, which will introduce faster concurrent I/O, an advanced streaming data loader for model training, and comprehensive data lineage capabilities, alongside integration with external data sources. He adds that although many customers exist in this space, no direct competitors have emerged.
Ultimately, Activeloop aspires to save enterprises the substantial costs of organizing and retrieving data in-house, freeing engineers to focus on productive work rather than repetitive coding tasks.