OpenAI has developed a system for watermarking text generated by ChatGPT, along with a detection tool, and the package has reportedly been ready for about a year. However, internal debate continues over whether to release it: shipping the tool could position the company as a responsible actor, but it could also hurt its bottom line.
The watermarking works by adjusting how the model chooses among the most likely words and phrases that follow a given input, creating a pattern that a detector can later recognize. Such a tool would help teachers identify students who submit AI-written assignments. OpenAI's tests indicate that watermarking does not degrade the quality of the chatbot's output, and in a survey the company commissioned, respondents worldwide supported the idea of an AI detection tool by a margin of four to one.
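OpenAI has not published the details of its scheme, so as a purely illustrative sketch, here is the general "green list" biasing technique described in public watermarking research: a secret key pseudorandomly partitions the vocabulary at each step, and sampling is gently tilted toward the keyed subset. Everything here (the toy vocabulary, the key, the function names) is hypothetical, not OpenAI's implementation; a companion detector is sketched after the next paragraph.

```python
import hashlib
import random

# Illustrative toy setup; a real system would operate on the model's full
# token vocabulary and keep the key server-side.
VOCAB = ["the", "a", "quick", "brown", "fox", "jumps", "over", "lazy", "dog", "runs"]
SECRET_KEY = "demo-watermark-key"

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    """Derive a pseudorandom 'green' subset of the vocabulary, seeded by the
    secret key and the previous token. The partition is reproducible at
    detection time but looks random to anyone without the key."""
    seed = hashlib.sha256(f"{SECRET_KEY}|{prev_token}".encode()).hexdigest()
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def sample_next(prev_token: str, probs: dict, bias: float = 2.0) -> str:
    """Boost the weight of green-list tokens before sampling, nudging the
    output toward a statistically detectable pattern while leaving the
    most likely words largely intact."""
    greens = green_list(prev_token)
    weights = {t: p * (bias if t in greens else 1.0) for t, p in probs.items()}
    r = random.uniform(0, sum(weights.values()))
    for token, w in weights.items():
        r -= w
        if r <= 0:
            return token
    return token  # numerical edge case: fall back to the last token
```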
In a recent blog post, OpenAI confirmed its work on text watermarking, describing the method as highly accurate (the company cites "99.9% effectiveness") and resistant to localized tampering such as paraphrasing. Nevertheless, it acknowledges that bad actors could sidestep detection through wholesale rewording, for example by running the text through another model. The company is also concerned that watermarking could stigmatize AI as a writing aid, particularly for non-native speakers.
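To make the detection side concrete, here is the companion check for the toy scheme above; again, this is a sketch of the published green-list approach, not OpenAI's actual detector. It counts how many tokens land in their predecessor's green list and computes a z-score: ordinary human text should hover near the expected fraction, while watermarked text drifts far above it. This also shows why wholesale rewording defeats detection, since replacing the tokens re-rolls each green-list membership and pulls the hit rate back toward the baseline.

```python
import math

def watermark_z_score(tokens: list, fraction: float = 0.5) -> float:
    """Count tokens that fall in their predecessor's green list (reusing
    green_list from the sketch above) and compare against the rate expected
    from unwatermarked text."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, fraction)
    )
    expected = fraction * n
    std = math.sqrt(n * fraction * (1 - fraction))
    return (hits - expected) / std  # large positive score suggests a watermark
```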
Notably, nearly 30% of surveyed ChatGPT users said they would use the software less if watermarking were implemented. Despite these concerns, some OpenAI employees still believe watermarking would be worthwhile; to address user reservations, others have suggested exploring methods that would be less controversial among users but are so far unproven.
In its latest blog update, OpenAI said it is in the early stages of investigating embedding metadata instead. The company stressed that it is too early to judge how well this approach will work, but because the metadata would be cryptographically signed, it would produce no false positives.
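OpenAI has not described how such metadata would be attached or signed, so purely to illustrate the "no false positives" property, the sketch below signs a payload with an assumed Ed25519 key pair using the third-party cryptography package. The model name and metadata fields are invented for the example. The key point is that a signature either verifies or it does not, so unsigned human text can never be flagged by mistake; the open question is whether such metadata would survive copying and editing.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical: the provider signs each output together with provenance
# metadata and publishes the corresponding public key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

text = "Sample model output."
metadata = b"model=example-model;ts=2024-08-05"
payload = text.encode() + b"|" + metadata
signature = private_key.sign(payload)

# Verification either succeeds or raises InvalidSignature; altered or
# unsigned text cannot produce a valid signature, hence no false positives.
try:
    public_key.verify(signature, payload)
    print("verified: text carries valid provenance metadata")
except InvalidSignature:
    print("no valid signature: cannot attribute this text")
```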