Developers often find themselves bogged down by mundane tasks, leaving far less time for actual coding. According to Stack Overflow’s 2022 developer survey, 63% of developers reported spending more than 30 minutes a day searching for solutions to problems. For a team of 50 developers, this translates to an estimated 333 to 651 hours lost per week. Additionally, a survey conducted by Propeller Insights and Rollbar found that more than a third of developers spend around 25% of their time fixing bugs, and 26% spend up to half their time on it.
This frustrating trend led William Zeng and Kevin Lu—both seasoned professionals from Roblox, known for its transformative gaming platform—to develop Sweep, a solution designed to automate routine development tasks, such as high-level debugging. “We conceived Sweep after our time at Roblox, where we frequently tackled software chores that could be automated using AI,” explained Zeng, Sweep’s CEO, in an email interview. “Sweep functions like an AI-powered junior developer for software teams.”
Previously featured during Y Combinator’s Summer 2023 Demo Day, Sweep has recently secured an additional $2 million in funding from Goat Capital, Replit CEO Amjad Masad, Replit VP of AI Michele Catasta, and Exceptional Capital, valuing the startup at $25 million post-money.
With Sweep, developers can describe a task in natural language, such as “add debug logs to my data pipeline,” without opening an integrated development environment (IDE), and the platform generates the corresponding code. It then submits that code as a pull request to the correct codebase and responds to comments from code maintainers or owners. This gives Sweep more autonomy than similar tools such as GitHub Copilot.
“Sweep enables engineers to deliver results faster,” Zeng stated. “We take care of technical debt that accumulates with each code change by enhancing error logs, adding unit tests, and refactoring inefficient code.”
Sweep specializes in writing Python code, employing a mix of AI models for code generation. While it utilizes OpenAI’s GPT-4, Zeng highlighted the importance of a custom “code search engine” that has not been trained on any customer data. This engine allows for comprehensive code changes across entire repositories. “Our proprietary Python code search engine combines lexical and vector search techniques,” Zeng explained. Lexical search identifies literal code matches or slight variations, while vector search finds loosely related code based on shared characteristics. “We boast exceptional unit test generation capabilities and can execute tests in real time,” he added.
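To make the distinction between lexical and vector search concrete, here is a minimal Python sketch of a hybrid ranker. It is not Sweep’s engine, whose internals are not public. The function names, the hashed character-trigram “embedding” (a toy stand-in for a learned embedding model), and the `alpha` weighting are all assumptions made for illustration.

```python
import math
import re
import zlib


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (underscores act as separators)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, doc):
    """Lexical match: fraction of query tokens that appear literally in the document."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


def embed(text, dim=64):
    """Toy embedding (assumption): hashed character-trigram counts, L2-normalised.

    A real system would use a learned embedding model here.
    """
    vec = [0.0] * dim
    s = " ".join(tokenize(text))
    for i in range(len(s) - 2):
        # crc32 is used instead of hash() so results are stable across runs.
        vec[zlib.crc32(s[i:i + 3].encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def vector_score(query, doc):
    """Vector match: cosine similarity between the two toy embeddings."""
    return sum(a * b for a, b in zip(embed(query), embed(doc)))


def hybrid_search(query, snippets, alpha=0.5):
    """Rank snippets by a weighted blend of lexical and vector scores.

    alpha=1.0 is purely lexical; alpha=0.0 is purely vector-based.
    """
    scored = [
        (alpha * lexical_score(query, s) + (1 - alpha) * vector_score(query, s), s)
        for s in snippets
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    snippets = [
        "def parse_config(path): return json.load(open(path))",
        "def send_email(to, body): smtp.send(to, body)",
        "class ConfigParser: def parse(self, path): ...",
    ]
    for score, snippet in hybrid_search("parse config", snippets):
        print(f"{score:.3f}  {snippet}")
```

The lexical term rewards snippets that contain the query’s exact tokens, while the vector term can still give credit to snippets that only partially resemble the query (for example, `ConfigParser`, which contains no standalone `config` token). How the two signals are weighted in practice is a tuning decision; the equal split here is arbitrary.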
Looking ahead, Sweep plans to enhance its code generation features using StarCoder, an open-source model from Hugging Face and ServiceNow.
Despite the promise of AI, there are concerns about Sweep’s long-term reliability. A research team associated with Stanford found that developers using AI tools are more likely to introduce security vulnerabilities, because AI-generated code can look correct on the surface while concealing security flaws.
Copyright issues also pose a risk. Certain code-generating models—though not necessarily StarCoder or Sweep’s own—are trained on copyrighted or restrictively licensed code. This raises potential legal risks for companies that might inadvertently integrate such code into their software. To mitigate this, Sweep encourages users to thoroughly review and edit any generated code before finalizing changes to the master codebase.
“The primary challenges facing AI developer tools revolve around reliability and managing large codebases,” Zeng acknowledged. “We draw upon our expertise in both legacy and current methodologies to ensure Sweep remains robust.”
Sweep is priced at a premium, charging $480 per seat per month, well above alternatives like GitHub Copilot or Amazon CodeWhisperer, which offer business-focused tiers at around $20 per user per month. Nonetheless, Zeng asserts that despite modest funding of $2.8 million, Sweep’s clientele provides sufficient revenue to sustain operations for years to come. “The new funding will allow us to expand our team from two to five employees over the next year,” he continued. “We will continue to focus on Python while enhancing our approach to tackling technical debt through unit testing, code refactoring, and resolving outstanding tasks in the code.”