OpenAI-Backed Ghost Claims LLMs Will Address Self-Driving Challenges, but Experts Remain Skeptical

The self-driving car industry is at a pivotal moment. Cruise recently recalled its entire fleet of autonomous vehicles following a tragic incident involving a pedestrian, and the California DMV suspended the company’s driverless robotaxi operations. Meanwhile, activists in San Francisco have taken to the streets, blocking driverless cars in protest against the city’s role as a testing ground for this emerging technology.

In this climate of uncertainty, one startup believes it has a way to make autonomous driving safer: Ghost Autonomy. The company develops autonomous driving software for automotive partners and has announced plans to explore multimodal large language models (LLMs), AI systems that can interpret both text and images, in the self-driving sector. Ghost Autonomy has partnered with OpenAI through the OpenAI Startup Fund, which grants it early access to OpenAI technologies and Azure resources from Microsoft, along with a $5 million investment.

“LLMs offer a fresh approach to understanding complex scenarios where current models struggle,” said John Hayes, co-founder and CEO of Ghost, in an email interview. “The potential applications for LLM-based analysis in autonomous driving will expand as these models evolve and become more efficient.”

But how is Ghost leveraging AI models designed for text and image processing to enhance the control of autonomous vehicles? According to Hayes, Ghost is piloting software that utilizes multimodal models to “interpret complex scenes,” suggesting road maneuvers (like “shift to the right lane”) to the vehicle's control systems based on images captured by on-board cameras.
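Ghost has not published its architecture, but a pipeline like the one Hayes describes would almost certainly need a guardrail between the model's free-text output and the vehicle: the suggestion must be mapped onto a closed set of permitted maneuvers before it reaches any control system, with anything unrecognized falling back to a safe default. A minimal sketch of that validation step, with all names hypothetical:

```python
from enum import Enum


class Maneuver(Enum):
    """A hypothetical closed set of maneuvers a control stack would accept."""
    KEEP_LANE = "keep lane"
    SHIFT_LEFT = "shift to the left lane"
    SHIFT_RIGHT = "shift to the right lane"
    SLOW_DOWN = "slow down"


def parse_suggestion(model_text: str) -> Maneuver:
    """Map a multimodal model's free-text suggestion onto the closed set.

    The model's raw text is never forwarded to the vehicle; an
    unrecognized or ambiguous suggestion degrades to KEEP_LANE
    rather than being executed verbatim.
    """
    text = model_text.lower()
    if "right lane" in text:
        return Maneuver.SHIFT_RIGHT
    if "left lane" in text:
        return Maneuver.SHIFT_LEFT
    if "slow" in text or "brake" in text:
        return Maneuver.SLOW_DOWN
    return Maneuver.KEEP_LANE
```

For example, a model response such as "Construction ahead, shift to the right lane" would resolve to `Maneuver.SHIFT_RIGHT`, while an unparseable answer leaves the vehicle in its lane. This is an illustration of the general pattern, not Ghost's actual implementation.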

“At Ghost, we aim to refine existing models and develop our own to maximize road reliability and performance,” Hayes explained. “For instance, construction zones present unique challenges such as temporary lanes and flagmen, which can complicate navigation for simpler models. LLMs have demonstrated an ability to consider all these elements with human-like reasoning.”

While Ghost advocates for the application of LLMs in self-driving technology, experts express skepticism. “Ghost is using ‘LLM’ as a marketing term,” noted Os Keyes, a Ph.D. candidate at the University of Washington studying law and data ethics, in an email. “If you swapped out LLM for ‘blockchain’ and revisited 2016, the scenario would be equally plausible—and just as dubious.”

Keyes argues that LLMs may not be the right fit for self-driving challenges, as they were not specifically formulated for this purpose. He likened the situation to using treasury bonds to prop up a table—possible, but impractical.

Mike Cook, a senior lecturer at King’s College London researching computational creativity, echoed this sentiment. He emphasized that multimodal models are not yet foolproof; OpenAI’s leading model sometimes generates inaccuracies that a human would avoid. “I don’t believe there’s a silver bullet,” Cook stated. “Putting LLMs at the core of a complex and risky task like driving is questionable. Researchers are already wrestling with ensuring the safety of LLMs for simpler applications, making their use for autonomous driving seem premature at best, and misguided at worst.”

Nevertheless, Hayes and the team at OpenAI remain optimistic. Brad Lightcap, OpenAI’s COO and manager of the OpenAI Startup Fund, remarked that multimodal models could broaden LLM applications across various fields, including autonomous driving. He stated: “By understanding and synthesizing video, images, and sound, multimodal models could offer novel insights into navigating intricate environments.”

As for Hayes, he envisions LLMs helping autonomous systems "reason through driving scenarios holistically" and leverage extensive world knowledge to tackle complex situations, even those previously unseen. He asserted that Ghost is actively testing multimodal decision-making with its development fleet and collaborating with automakers to validate and integrate new large models into their autonomy framework.

“No question, the current models need refinement before they can be commercially viable,” Hayes admitted. “That’s precisely why companies focused on application-specific R&D, like ours, are crucial. We have access to a wealth of training data and deep industry knowledge that will enhance these general models. Ultimately, achieving true safety in autonomous driving will require a comprehensive system, integrating various model types and functions, with multimodal models being one valuable tool.”

Despite these ambitious claims, the challenge remains significant. With established companies like Cruise and Waymo facing considerable hurdles even after years of development, the question lingers: Can Ghost deliver on its promises? The journey ahead looks challenging.
