Cognition recently captured attention with its AI-driven software engineer, Devin, which could autonomously write and edit code using OpenAI’s GPT-4. However, only five months after Devin's launch in March 2024, a new challenger has arrived: Cosine's Genie.
Genie, an autonomous AI engineer developed by the Y Combinator-backed Cosine, claims to outperform Devin with a score of 30% on the SWE-Bench benchmark, significantly higher than Devin's 13.8% and better than the 19% reported for both Amazon’s Q and Factory's Code Droid.
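To put those percentages side by side, the short Python sketch below computes how far ahead Genie's reported score is, both in percentage points and as a multiple. It uses only the figures quoted above; the function name `compare_to_leader` is made up for this illustration, and the scores are vendor-reported numbers that may not have been measured under identical conditions.

```python
# SWE-Bench scores as quoted in this article (percent of tasks resolved).
# These are vendor-reported figures, not independently verified here.
scores = {
    "Genie": 30.0,
    "Amazon Q and Factory's Code Droid": 19.0,
    "Devin": 13.8,
}


def compare_to_leader(scores, leader):
    """Return {name: (gap_in_points, ratio)} for every entry except the leader."""
    lead = scores[leader]
    result = {}
    for name, score in scores.items():
        if name == leader:
            continue
        gap = lead - score      # absolute gap in percentage points
        ratio = lead / score    # how many times higher the leader's score is
        result[name] = (gap, ratio)
    return result


for name, (gap, ratio) in compare_to_leader(scores, "Genie").items():
    print(f"Genie vs {name}: +{gap:.1f} points, {ratio:.2f}x")
```

On these numbers, Genie's reported score is about 1.58 times the 19% figure and about 2.17 times Devin's 13.8%.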
Cosine's CEO, Alistair Pullen, emphasizes that Genie goes beyond benchmark scores. "This model was specifically trained to think and behave like a human software engineer," he stated on social media.
What is Genie and What Can It Do?
Genie is designed to handle a variety of coding tasks autonomously—from bug fixing to feature building and code validation. It can operate independently or collaborate with users, mimicking the experience of working alongside a skilled colleague. “We aim to create an artificial colleague capable of performing end-to-end programming tasks reliably,” Pullen noted during the announcement of Genie’s capabilities.
Genie supports 15 programming languages, including:
- JavaScript
- Python
- TypeScript
- Java
- C
- C++
- Rust
- Swift
- PHP
- Ruby
Pullen explains, “By observing how human engineers work, Genie learns to replicate their processes.” The code generated is stored in users’ GitHub repositories, ensuring that Cosine does not retain any sensitive information.
Genie integrates seamlessly with platforms like Slack, allowing it to communicate with users in a manner akin to a human colleague. It can ask clarifying questions and respond to feedback on pull requests, further enhancing collaboration.
Powered by an Advanced OpenAI Model
Genie utilizes a proprietary variant of OpenAI’s GPT-4o, specifically designed for long outputs. This model can generate up to 64,000 tokens in a single response, 16 times the 4,000-token limit of previous versions.
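As a rough illustration of what the larger output limit means in practice, the Python sketch below counts how many separate responses a model would need to emit a change of a given size under each limit. The two limits come from the figures above; the 50,000-token patch size and the helper name `responses_needed` are hypothetical, chosen only for this example, and the sketch ignores any overhead from splitting work across responses.

```python
import math

OLD_OUTPUT_LIMIT = 4_000    # tokens per response, earlier models (per the article)
NEW_OUTPUT_LIMIT = 64_000   # tokens per response, Genie's GPT-4o variant (per the article)


def responses_needed(total_tokens, limit):
    """Minimum number of responses needed to emit total_tokens at a given per-response limit."""
    return math.ceil(total_tokens / limit)


# Hypothetical example: a code change that takes 50,000 tokens to write out.
patch_tokens = 50_000

print("Limit ratio:", NEW_OUTPUT_LIMIT // OLD_OUTPUT_LIMIT)
print("Responses at old limit:", responses_needed(patch_tokens, OLD_OUTPUT_LIMIT))
print("Responses at new limit:", responses_needed(patch_tokens, NEW_OUTPUT_LIMIT))
```

Under these assumptions, the hypothetical patch fits in one response at the 64,000-token limit but would need 13 responses at 4,000 tokens each.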
With an extensive dataset comprising billions of token combinations curated from real engineering activities, Genie continually improves its performance. “Our training data includes PRs, commits, and issues gathered from open-source repositories,” Pullen commented. The meticulous data pipeline ensures high-quality insights into human problem-solving approaches.
Pricing Structure
Genie will initially offer two pricing tiers:
1. Individual Plan: Competitively priced around $20, this tier features limited capabilities but demonstrates Genie’s potential for individuals and small teams.
2. Enterprise Plan: This comprehensive offering includes unlimited usage and advanced features designed to create an exceptional AI engineering colleague.
Implications and Future Prospects
Genie’s advanced capabilities stand to revolutionize software development by increasing efficiency and allowing engineering teams to focus on strategic objectives. “The ability for an AI to handle complex codebases autonomously can radically change our approach to resource allocation,” said Pullen.
Cosine aims to expand Genie’s functionalities, developing smaller models for basic tasks and larger versions for intricate challenges. Plans to collaborate with open-source communities are also on the horizon.
Next Steps and Availability
While Genie is currently being offered to select users, interested parties can apply for early access through the Cosine website. Cosine is committed to continual improvement, incorporating user feedback to enhance Genie’s capabilities.
In addition, Cosine aims to maintain some proprietary aspects of its methodology while transparently sharing Genie’s outputs on GitHub for independent verification.
About Cosine
Founded in 2022 by Pullen, Sam Stenner, and Yang Li, Cosine is dedicated to applying human reasoning to complex problems in artificial intelligence, starting with software engineering. With $2.5 million in seed funding, Cosine aims to redefine how AI can mimic and innovate on human tasks.
"We believe we can translate human reasoning for any industry, starting with software engineering," Pullen affirmed. The launch of Genie is just the beginning of Cosine's ambitious journey.