GitHub Copilot Now Alerts Developers When Its Suggestions Align with Code in Public Repositories

GitHub Copilot Enhances Code Writing with New Features

GitHub Copilot has transformed the way developers write code. However, it can sometimes generate code similar to what's found in existing public repositories, which raises licensing and attribution questions. In 2022, GitHub introduced an option to automatically block suggestions that match public code. A GitHub spokesperson indicated that this filter activates less than 1% of the time. Nevertheless, developers sometimes want to see these matching snippets, either to use them (while adhering to their company's licensing requirements) or to explore the library the code came from.

To bridge this gap, GitHub has launched a private beta of a code referencing feature for GitHub Copilot that gives developers more choice. When code referencing is enabled, Copilot no longer automatically blocks generated code that matches public code; instead, it shows the matching code in a sidebar so developers can make an informed decision about whether to use it. GitHub also plans to bring the feature to Copilot Chat.
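The difference between the two modes can be sketched as a small decision function. This is only an illustration of the behavior described above, not GitHub's implementation: the `Action` names, the `decide` function, and its parameters are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    SHOW = "show"                                   # ordinary suggestion, no match found
    BLOCK = "block"                                 # 2022 behavior: suppress the suggestion
    SHOW_WITH_REFERENCES = "show_with_references"   # code referencing: show matches in a sidebar


def decide(has_public_match: bool,
           referencing_enabled: bool,
           block_matches: bool) -> Action:
    """Hypothetical policy: what to do with one generated suggestion."""
    if not has_public_match:
        return Action.SHOW
    if referencing_enabled:
        # Code referencing replaces automatic blocking with a sidebar of matches.
        return Action.SHOW_WITH_REFERENCES
    # Without code referencing, the older blocking option applies if it is turned on.
    return Action.BLOCK if block_matches else Action.SHOW
```

Under these assumptions, `decide(True, True, True)` yields `Action.SHOW_WITH_REFERENCES`, while the same matching suggestion with referencing disabled is blocked.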

GitHub previewed this functionality last November, and it has taken some time to finalize its release.

As GitHub CEO Thomas Dohmke noted, while Microsoft, GitHub, and many Copilot enterprise users have relied on the initial blocking feature, it proved to be somewhat limiting. “It gives you little control over whether you want to use that code and how to attribute it back to an open source license. It also misses the opportunity to explore libraries that could be beneficial instead of just synthesizing code,” he explained. “You might inadvertently replicate existing solutions found in open source repositories.”

Dohmke emphasized that this issue often arises with common algorithms, such as sorting, that have been implemented many times in public code. Now, developers can choose to reject the code, use it directly if the library's license permits, or ask Copilot to rewrite it to eliminate any matching elements.

Currently, the feature does not support filtering results by specific licenses, but the GitHub team is proactively gathering feedback to assess demand for this capability.

“We’re empowering users to understand the matches and encouraging exploration or informed decision-making,” Dohmke stated. “This addresses the limitations of the previous solution effectively.”

The code referencing feature is triggered more frequently in scenarios where Copilot has limited context. When extensive context is available from the developer’s existing code, the likelihood of producing matching suggestions diminishes. Conversely, when starting a new project, it becomes more common for matching code to be generated.

At the heart of this feature is a fast search engine designed to keep lookup latency to 10 to 20 milliseconds, so that matching code and its associated license can be identified quickly. For now, matching snippets are displayed in the order the search engine returns them. In its original announcement last year, GitHub said it intended to let developers sort these snippets by repository license, commit date, and other factors, and we expect these capabilities to arrive soon.
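GitHub has not described how its search engine works, so the following is only a minimal sketch of one common approach to this kind of lookup: hashing overlapping windows of tokens and looking them up in an in-memory index that maps each hash to a repository and license. The class names, the window size, the overlap threshold, and the repository name are all assumptions made for illustration.

```python
import hashlib
import re
from collections import defaultdict
from dataclasses import dataclass

WINDOW = 5  # tokens per fingerprint; an arbitrary choice for this sketch


@dataclass(frozen=True)
class Source:
    repo: str     # hypothetical repository name
    license: str  # license identifier recorded for that repository


def tokenize(code: str) -> list[str]:
    # Identifiers, numbers, and single punctuation characters.
    # Whitespace is dropped, so formatting differences do not matter.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def fingerprints(code: str) -> set[str]:
    # Hash every run of WINDOW consecutive tokens.
    tokens = tokenize(code)
    result = set()
    for i in range(len(tokens) - WINDOW + 1):
        window = " ".join(tokens[i:i + WINDOW])
        result.add(hashlib.sha1(window.encode()).hexdigest())
    return result


class ReferenceIndex:
    """Maps token-window hashes to the public sources that contain them."""

    def __init__(self) -> None:
        self._index: dict[str, set[Source]] = defaultdict(set)

    def add(self, code: str, source: Source) -> None:
        for fp in fingerprints(code):
            self._index[fp].add(source)

    def lookup(self, suggestion: str, min_overlap: float = 0.6):
        # Return (source, overlap) pairs where at least min_overlap of the
        # suggestion's fingerprints also appear in that source.
        fps = fingerprints(suggestion)
        if not fps:
            return []
        counts: dict[Source, int] = defaultdict(int)
        for fp in fps:
            for source in self._index.get(fp, ()):
                counts[source] += 1
        matches = [(s, c / len(fps)) for s, c in counts.items()
                   if c / len(fps) >= min_overlap]
        return sorted(matches, key=lambda m: m[1], reverse=True)
```

Each lookup costs one dictionary access per fingerprint, which is the kind of design that makes millisecond-scale responses plausible. A production system would also need a far larger index and a smarter choice of which windows to hash.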
