Today, Cognition, an AI startup supported by Peter Thiel’s Founders Fund alongside tech leaders like former Twitter executive Elad Gil and DoorDash co-founder Tony Xu, unveiled “Devin,” a fully autonomous AI software engineer.
Unlike existing coding assistants such as GitHub Copilot, Devin is designed to handle entire development projects from start to finish, including writing code, debugging it, and running the result. Cognition has also shown it completing real jobs posted on platforms like Upwork.
The launch of Devin marks a shift in AI-assisted software development: engineers get an AI worker that can take on whole tasks rather than a tool that only suggests code snippets.
Currently, Devin is not publicly available. Limited access has been granted to a select group of users, including Bloomberg journalist Ashlee Vance, who shared insights about using the software.
What can Devin do?
Cognition CEO Scott Wu detailed Devin's capabilities in a blog post, highlighting its access to common developer tools, such as a code editor and a browser, within a secure, sandboxed environment. According to Wu, Devin can tackle complex engineering tasks that involve making thousands of decisions.
Users enter a natural language prompt into Devin’s chatbot interface, and Devin devises a step-by-step plan to address the task. It then writes code, fixes issues, runs tests, and reports progress in real time, so users can follow the work as it happens.
If users spot a problem, they can step in through the same chat interface and give Devin further instructions. The idea is that engineering teams can delegate routine tasks to Devin and focus on higher-level, creative work.
Devin points to a model of software development in which AI workers carry out tasks under human supervision.
Versatile in handling development tasks
In Wu's demonstrations, Devin handles a range of tasks, including end-to-end app and website deployment, finding and fixing bugs, and more advanced projects such as fine-tuning a large language model using a research repository on GitHub.
In one instance, Devin learned from a blog post to produce images with hidden messages, while in another, it successfully managed an Upwork project involving computer vision model development.
On SWE-bench, a benchmark built from real-world issues in open-source GitHub projects, Devin resolved 13.86% of issues without assistance. By comparison, Claude 2 resolved 4.80%, SWE-Llama-13b 3.97%, and GPT-4 1.74%, and all three were evaluated with assistance, meaning they were told which files needed to be edited.
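To put those percentages in perspective, the short Python sketch below computes how many times higher Devin's reported resolution rate is than each of the other models' rates. It uses only the four figures quoted above; the dictionary and the helper name `ratio_to` are illustrative and not part of any benchmark tooling. Because Devin and the other models were not evaluated under the same conditions, the ratios are a rough guide rather than a like-for-like comparison.

```python
# Reported SWE-bench resolution rates, in percent, as quoted in the article.
rates = {
    "Devin": 13.86,
    "Claude 2": 4.80,
    "SWE-Llama-13b": 3.97,
    "GPT-4": 1.74,
}


def ratio_to(baseline: str, model: str = "Devin") -> float:
    """Return how many times higher `model`'s rate is than `baseline`'s."""
    return rates[model] / rates[baseline]


# Print Devin's rate relative to every other model in the table.
for name in rates:
    if name != "Devin":
        print(f"Devin vs {name}: {ratio_to(name):.1f}x")
```

On these figures, Devin's rate comes out at roughly 2.9 times that of Claude 2, 3.5 times that of SWE-Llama-13b, and 8.0 times that of GPT-4.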
Core technology remains undisclosed
AI’s presence in software development is not new; tools like GitHub Copilot, StarCoder, and Codeium are already in use. However, most focus on assisting developers as they write code rather than managing entire projects independently. Cognition positions Devin as a step beyond that: a fully autonomous AI engineer.
Though still undergoing testing, Devin’s ability to navigate multifaceted engineering projects autonomously sets it apart. Cognition has not disclosed whether it employs a proprietary model or a third-party solution but emphasizes advancements in long-term reasoning and planning as key to its functionality.
The company is currently expanding its capacity and extending early access to select users. Teams interested in using Devin for engineering work can contact Cognition by email, and broader access is expected at a later date.
Cognition hints that coding is “just the beginning,” suggesting plans to develop similar AI agents across other fields. So far, the company has secured $21 million in funding.