A Dutch startup aims to help businesses extract data from large volumes of complex documents, with an emphasis on both accuracy and security. Backed by Google’s Gradient Ventures, Send AI is positioning itself against established players in the document processing sector, such as UiPath, Abbyy, Rossum, and Kofax, by offering a customizable platform that lets companies tailor AI models to their specific data extraction requirements.
For example, companies in highly regulated fields like insurance must navigate a variety of formats, including PDFs, paper files, and smartphone images taken in different orientations with various backgrounds. This unstructured data can pose challenges even for human reviewers, and a fully automated approach may lead to mistakes in claims processing or reimbursement, resulting in administrative complications.
Conventional document processing solutions often target general document types applicable across multiple industries, which may not cater to specialized needs. In contrast, Send AI allows organizations to train a computer vision model to identify specific documents, alongside a dedicated language model for data extraction and validation. If any uncertainty arises, human reviewers are integrated into the process through a user-friendly web interface.
“This validation can be as simple as confirming an expected number’s format, or as complex as verifying a registration number against a database for accuracy,” explained Send AI founder and CEO Thom Trentelman. “Any discrepancies are flagged for human review.”
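The two kinds of check Trentelman describes can be illustrated with a minimal Python sketch. It is not Send AI's implementation: the registration-number format, the `KNOWN_REGISTRATIONS` set standing in for a database, and the function and class names are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical format: two capital letters followed by six digits.
REGISTRATION_PATTERN = re.compile(r"[A-Z]{2}\d{6}")

# Hypothetical stand-in for a lookup against a real registry database.
KNOWN_REGISTRATIONS = {"NL123456", "NL654321"}


@dataclass
class ValidationResult:
    value: str
    issues: list[str] = field(default_factory=list)

    @property
    def needs_human_review(self) -> bool:
        # Any discrepancy sends the extracted value to a human reviewer.
        return bool(self.issues)


def validate_registration(value: str) -> ValidationResult:
    """Run the simple format check first, then the registry lookup."""
    result = ValidationResult(value=value)
    if not REGISTRATION_PATTERN.fullmatch(value):
        result.issues.append("unexpected format")
    elif value not in KNOWN_REGISTRATIONS:
        result.issues.append("not found in registry")
    return result
```

A value that passes both checks flows through automatically, while one that fails either check is flagged with the reason attached.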
Founded in Amsterdam in 2021 as Autopilot, Send AI initially secured a modest $100,000 investment from a university alumni fund. Now, as the company gears up for growth, it has raised €2.2 million ($2.4 million) in a pre-seed funding round co-led by Google’s Gradient Ventures and Keen Venture Partners, with participation from angel investors from companies such as DeepMind.
How Send AI Empowers Data Extraction from Documents
Businesses can access Send AI’s cloud-based software through APIs, which streamline the flow of data from documents received via email. After documents are received, Send AI enhances their visual quality and analyzes them using its language models for classification and data extraction.
In terms of its target audience, Trentelman indicates that the company primarily focuses on larger enterprises that face significant challenges with document processing. However, any business managing high volumes of documents could benefit from this innovative technology.
While Send AI competes with numerous existing document-processing tools, it also faces competition from a new wave of startups building on large language models (LLMs), like OpenAI's GPT-X, which powers ChatGPT. Trentelman acknowledges that these LLMs perform well on tasks involving subjective judgment, such as summarization, but he argues that processing extensive document datasets demands high accuracy, and that generic LLMs may fall short.
“You will encounter limitations with these technologies sooner rather than later—large, generic LLMs can be unpredictable, slow, and costly,” noted Trentelman. “At Send AI, we enable customers to create their customized solutions.”
Internally, Send AI operates using smaller, open-source models that customers initially train on a small batch of documents. This approach allows for ongoing improvements as more documents are processed with human corrections.
Pricing at Send AI operates on a credit-based model where customers are charged per processing step. Trentelman explained, “This allows us to distinguish between processing a 50-page PDF and a single text snippet. Our models are efficient, economical, and dependable, enabling us to deploy them tailored to each customer’s needs. This model is particularly effective in regulated sectors like health insurance and government.”
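To make the arithmetic behind per-step pricing concrete, here is a small Python sketch. The step names, the credit price of each step, and the assumption that every step is charged once per page are all hypothetical; the article does not give Send AI's actual rates.

```python
# Hypothetical credit prices per processing step (not Send AI's real rates).
CREDITS_PER_STEP = {"enhance": 1, "classify": 1, "extract": 2}


def credits_for(pages: int, steps: list[str]) -> int:
    """Total credits, assuming each step is charged once per page."""
    if pages < 1:
        raise ValueError("a document has at least one page")
    return pages * sum(CREDITS_PER_STEP[step] for step in steps)


# A 50-page PDF run through every step versus a single snippet that
# only needs extraction.
pdf_cost = credits_for(50, ["enhance", "classify", "extract"])
snippet_cost = credits_for(1, ["extract"])
print(pdf_cost, snippet_cost)
```

Under these assumed rates the PDF costs 50 × (1 + 1 + 2) = 200 credits and the snippet costs 2, which is the kind of distinction between workloads that Trentelman describes.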
Control Over Data Management
Send AI asserts that its technology is well-suited for highly regulated industries because of the control it gives customers over their data. The company's own solution is cloud-based, but Trentelman emphasizes that generic LLMs from providers like OpenAI can blend training data from various clients into a single model, raising concerns about potential data leaks. Those concerns have led to a surge of startups promising enhanced privacy in LLM-driven solutions.
To allay these concerns, Send AI employs isolated open-source transformer models for each individual customer. “We utilize a variety of models to accomplish the tasks—while they may seem basic initially, they become robust and accurate once trained on quality data,” stated Trentelman.
While the models and training data reside in Send AI’s cloud, keeping each customer's model isolated makes it clear where that customer's data is stored, so it can be deleted on request. According to Trentelman, this capability sets Send AI apart from competitors and reassures data-sensitive companies that on-premise installations are not their only choice.
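The idea of per-customer isolation can be sketched in a few lines of Python. This is an illustrative in-memory model only, not Send AI's architecture: `IsolatedStore`, `CustomerSpace`, and the `models/<customer_id>` path layout are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CustomerSpace:
    model_path: str  # hypothetical location of this customer's own model
    training_docs: list[str] = field(default_factory=list)


class IsolatedStore:
    """Keeps each customer's model and training data in a separate space."""

    def __init__(self) -> None:
        self._spaces: dict[str, CustomerSpace] = {}

    def add_document(self, customer_id: str, doc_id: str) -> None:
        # Create the customer's space on first use; never share it.
        space = self._spaces.setdefault(
            customer_id, CustomerSpace(model_path=f"models/{customer_id}")
        )
        space.training_docs.append(doc_id)

    def locate(self, customer_id: str) -> Optional[CustomerSpace]:
        # Everything held for one customer is found under one key.
        return self._spaces.get(customer_id)

    def delete_customer(self, customer_id: str) -> bool:
        # Deleting on request removes the model reference and all its data.
        return self._spaces.pop(customer_id, None) is not None
```

Because nothing is pooled across customers, removing one customer's space leaves every other customer's model and documents untouched.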
“Many regulated companies now permit suppliers to utilize public cloud services as long as they adhere to strict compliance regulations,” Trentelman added. “Initially, we often receive inquiries about on-premise deployments, but ultimately, nearly all choose our public cloud solution.”
Currently operating in private beta, Send AI boasts notable clients, including the insurance leader Axa. With a seven-member team, the startup plans to use its recent funding to double its workforce over the coming year as it gears up for a full-scale commercial launch.