Synthetaic Harnesses AI to Uncover Patterns in Vast Data Sets

Remember the Chinese “spy” balloon from 2023?

If you need a quick reminder: Approximately a year ago, a high-altitude balloon from China drifted across U.S. airspace, largely unnoticed until it was eventually spotted and shot down by the U.S. Air Force. The balloon’s origins proved challenging for civilians to trace—until AI firms like Synthetaic demonstrated the capability to do so using satellite imagery.

The balloon incident turned into a significant product demonstration for Synthetaic, capturing the interest of notable investors such as defense contractor Booz Allen Hamilton.

This week, Synthetaic secured $15 million in a Series B funding round, co-led by Lupa Systems and TitletownTech, a venture capital partnership between the Green Bay Packers and Microsoft. Additional support came from IBM Ventures and Booz Allen Hamilton. With this latest investment, Synthetaic's total funding reaches $32.5 million, which will be channeled into accelerating the commercialization of its computer vision technology and nearly doubling its team to 80 employees by year’s end, according to CEO Corey Jaskolski.

“The volume of image data being produced is skyrocketing, highlighting the growing need for advanced AI solutions to manage and analyze this vast information reservoir,” Jaskolski explained in an email interview. “Generating actionable insights from this massive data pool is a substantial challenge and priority for many sectors, including defense, geospatial analysis, video security, and drone monitoring. Synthetaic’s AI capabilities in unsupervised learning and data analytics strategically position us to navigate this evolving technological landscape.”

Jaskolski, an MIT alumnus and former technology director at National Geographic, is an adventurous spirit. He has scuba dived among icebergs in Antarctica, descended over 12,500 feet underwater to explore the Titanic wreck, led helicopter mapping initiatives on the Nepalese side of Everest, and braved submerged caves to catalog human remains and Ice Age bear skeletons.

So why did a globetrotting adventurer like Jaskolski establish Synthetaic?

The answer is straightforward: he recognized that while AI holds promise for classifying global information, it is often hindered by the necessity for manual data annotation.

“Human labeling is the norm for AI training,” Jaskolski noted. “As AI models expand, they perform better but require more data to train due to an increasing number of internal parameters. Historically, the solution has been to enlist millions of annotators to label data. But what if we didn’t need human-labeled data at all?”

Synthetaic, launched in 2019, offers a tool called Rapid Automatic Image Categorization (RAIC), which automates the analysis of unlabeled datasets, such as satellite images and video footage. Rather than relying on data labeled by annotators, RAIC requires only that a user supply an example image; the system then locates visually similar instances of that object within a dataset.
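Synthetaic has not published how RAIC works internally, so the following is only a generic sketch of query-by-example search, not the company's method. It assumes every image tile has already been turned into a numeric feature vector by some unspecified model, and it ranks tiles by cosine similarity to the query vector. The names `cosine_similarity`, `rank_by_similarity`, and `TILE_EMBEDDINGS`, along with the tile IDs and vector values, are hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Vector, tiles: Dict[str, Vector], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k (tile_id, score) pairs most similar to the query."""
    scored = [(tile_id, cosine_similarity(query, vec)) for tile_id, vec in tiles.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-written feature vectors standing in for the output of
# an image-embedding model (an assumption; no real model is used here).
TILE_EMBEDDINGS: Dict[str, Vector] = {
    "tile_a": [0.9, 0.1],
    "tile_b": [0.0, 1.0],
    "tile_c": [-1.0, 0.0],
}

# The query vector stands in for the embedding of the user's example image.
best_matches = rank_by_similarity([1.0, 0.0], TILE_EMBEDDINGS, top_k=2)
```

In this toy setup no labels are involved: the only input is one example vector, and `best_matches` lists the tiles pointing in the most similar direction. A production system would need a learned embedding model and an index that scales to far larger image collections.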

In the case of the Chinese balloon, this innovative approach enabled Synthetaic’s platform to detect the balloon using nothing more than a conceptual sketch of its appearance from space and recent satellite imagery from the balloon’s last known location.

“RAIC allows for the handling of scarce or complex data sets, accelerating AI development and enhancing predictive modeling without the limitations of data quantity or quality,” Jaskolski stated. “This positions RAIC as a strategic asset for fostering innovation, operational efficiency, and competitive advantage, especially in scenarios where data scarcity hampers AI implementation.”

Synthetaic is not alone in exploring synthetic data for AI training. Synthesis AI, which raised $17 million in April 2022, is developing synthetic data platforms for various AI applications. Scale AI launched a program two years ago to enhance real-world datasets with synthetic samples. Meanwhile, firms like Parallel Domain focus on creating synthetic data for specific applications such as autonomous driving.

Gartner predicts that by 2024, 60% of data utilized for AI development and analytics will be synthetically generated. However, experts express concerns that the potential downsides and risks associated with synthetic data are often overlooked.

In a January 2020 study, researchers at Arizona State University demonstrated that an AI trained on images of professors produced highly realistic faces, predominantly white and male, reflecting the biases present in the original dataset.

Despite these risks, Synthetaic's clientele seems undeterred. The startup claims collaboration with the U.S. Air Force to test AI-driven object detection in geospatial data and with The Nature Conservancy to identify bird species once thought to be extinct. Additionally, Synthetaic holds a contract with AFWERX, the Air Force’s innovation arm, to develop technologies for object labeling, AI modeling, and object detection in satellite images.

Jaskolski believes that RAIC has far-reaching applications, from AI prototyping and drone monitoring to content moderation. Pointing to Synthetaic’s collaboration with CNN to analyze war imagery from Gaza and its partnership with Planet Labs for Earth imaging data analytics, he says the company’s business model can withstand the tech sector’s downturn and broader economic challenges.

“Synthetaic’s technology represents a transformative approach to AI model development and training, addressing the crucial needs of decision-makers in the tech industry,” Jaskolski concluded. “For executives, Synthetaic’s RAIC enables effective management of scarce or complex datasets, speeding up AI development and refining predictive modeling without being restricted by data size or quality. This makes RAIC an invaluable asset for driving innovation and competitive growth, especially in areas where data scarcity impedes AI adoption.”
