Rabbit Develops AI Model to Comprehend Software Functionality

Introducing Rabbit: Natural Language Software Control Powered by AI

Imagine interacting with any software using natural language—typing a simple prompt and having AI convert it into machine-readable commands that execute tasks on your PC or phone. This is the vision behind Rabbit, formerly known as Cyber Manufacture Co., which is developing a custom AI-driven user interface layer that connects users with any operating system seamlessly.

Co-founded by Jesse Lyu, who earned his mathematics degree from the University of Liverpool, along with Alexander Liao, a former researcher at Carnegie Mellon, Rabbit is building Rabbit OS. This innovative platform aims to allow AI to see and interact with desktop and mobile interfaces just like humans do.

“The rapid advancements in generative AI have sparked various initiatives in the tech industry, aiming to redefine human-machine interaction,” Lyu shared in a recent email interview. “We believe that the key to success lies in offering an outstanding end-user experience. Through our past experiences, we’ve learned that transforming user experience requires a tailored platform. This principle is at the heart of Rabbit's current product and technology stack.”

Rabbit has secured $20 million in funding from investors such as Khosla Ventures, Synergis Capital, and Kakao Investment, with a valuation estimated between $100 million and $150 million. While Rabbit is not the first to layer natural language interfaces on existing software, its approach stands out.

For instance, Google’s AI lab, DeepMind, has explored methods for teaching AI to operate computers by observing human input in tasks such as flight booking. Similarly, a research team from Shanghai Jiao Tong University has released a web-navigating AI agent that the team claims can master online tasks. Additionally, viral applications like Auto-GPT use OpenAI's models to operate software autonomously, engaging with both local and online applications.

However, Rabbit’s primary competitor appears to be Adept, which is training a model called ACT-1 designed to execute commands like “generate a monthly compliance report” or “draw stairs in a blueprint” across various software, including Airtable and Photoshop. Co-founded by engineers from DeepMind, OpenAI, and Google, Adept has raised substantial investments from leading firms, including Microsoft and Nvidia, and carries a valuation of around $1 billion.

So, how does Rabbit plan to stand out in this competitive landscape? Lyu emphasizes its unique technical approach.

While Rabbit might seem similar to robotic process automation (RPA), which automates repetitive tasks using software robots, Lyu asserts that the company's technology is more advanced. According to Lyu, Rabbit’s core interaction model is designed to “understand complex user intentions” and navigate user interfaces, with the potential to “grasp human intentions on computers.”

“The model can already engage with many major consumer applications—such as Uber, DoorDash, Expedia, and Spotify—across Android and web platforms,” Lyu noted. “Next year, we aim to extend this functionality to all platforms (Windows, Linux, MacOS) and smaller consumer apps.”

The Rabbit model can carry out tasks like booking flights or making restaurant reservations, and can even edit images in Photoshop using the appropriate built-in tools. However, in a demo on Rabbit’s website, the model exhibited limited functionality. When asked to edit a photo, it requested that an image be specified, which was impossible because the demo offered no upload option.

The model effectively answers general inquiries that might involve searching the broader web, similar to ChatGPT with web access. For example, when asked for the cheapest flights from New York to San Francisco on October 5, it provided a plausible response within 20 seconds.

On sensitive topics, the Rabbit model declined to engage with inappropriate prompts, such as requests about making dangerous items or prompts questioning significant historical events. Based on my brief testing, this suggests that the team has learned from the shortcomings of previous language models.

“By leveraging our model, the Rabbit platform empowers any user, regardless of expertise, to teach the system how to accomplish specific goals on applications,” Lyu explained. “The model continuously learns from aggregated demonstrations and internet data, creating a ‘conceptual blueprint’ for any application service.”

Rabbit’s model is said to be resilient to interface variations and changes. It learns by observing a human use a software interface, via screen recording, at least once.

However, the extent of the Rabbit model's robustness is still unclear. The team itself acknowledges this uncertainty, reflecting the potential complexities of navigating numerous user interfaces across devices. This is why, in addition to model development, the company is creating a framework for testing and refining the model to ensure its functionality in real-world scenarios.

As part of its vision, Rabbit also plans to introduce dedicated hardware to support its platform. While this could be an ambitious undertaking, considering the challenges in hardware scaling and the possible consumer resistance to vendor lock-in, Lyu emphasizes that they are crafting a novel and affordable mobile device specifically designed for natural language interactions.

“We're developing an intuitive mobile device that will access our platform,” Lyu said. “This unique form factor allows us to create innovative interaction patterns that existing platforms may not support.”

Beyond hardware, Rabbit faces challenges in scaling its platform. A comprehensive model like Rabbit’s requires extensive example data of successfully completed tasks, which can demand considerable investment. Past studies have shown that gathering such training data can be arduous and expensive.

Despite its current funding, which supports a lean team of nine people operating from Lyu's home, questions remain about Rabbit's ability to compete with established players in the sector and with emerging threats such as Microsoft’s Copilot and OpenAI’s ChatGPT ecosystem.

Nevertheless, Rabbit's ambitions remain high, with plans to generate sustained revenue through licensing and to keep improving its model.

“We have yet to release our product, but our early demos have drawn significant interest,” Lyu said. “As we evolve our models, we will collect valuable data and follow rigorous evaluation benchmarks. The Rabbit team is focused on embedding cutting-edge research into robust and secure systems for rapid deployment.”
