OpenAI Adopts a 'Deliberate Approach' for Releasing Tools to Detect ChatGPT-Generated Content

OpenAI has developed a tool that can detect when ChatGPT was used to write a piece of text, such as a student's assignment. However, as reported by The Wall Street Journal, the company is still weighing whether to make this tool publicly available.

In a statement, an OpenAI spokesperson confirmed that the company is exploring a text watermarking technique mentioned in the Journal’s article. They emphasized a "deliberate approach" due to the complexities involved and the potential impact this technology may have on the wider ecosystem beyond OpenAI itself.

“The development of our text watermarking method is technically promising, yet it also presents significant risks," the spokesperson noted. "We are currently assessing these risks while considering alternatives, including the method's vulnerability to evasion by malicious users and its possible disproportionate effects on certain groups, such as non-English speakers.”

This proposed method marks a shift from previous attempts to detect AI-generated text, which have generally fallen short of being reliable. Notably, OpenAI discontinued its earlier AI text detector last year, citing its “low accuracy.”

Through text watermarking, OpenAI aims to identify content generated specifically by ChatGPT, rather than by other companies’ models. The approach would make subtle adjustments to how ChatGPT selects words, embedding an invisible statistical watermark in the text that a separate detection tool could later recognize.

In response to the Wall Street Journal’s coverage, OpenAI also revised a May blog post discussing its ongoing research into identifying AI-generated materials. The update indicated that text watermarking has demonstrated “high accuracy” and is effective against localized alterations, such as paraphrasing. However, it noted that the method is “less robust” against broader manipulations, like employing translation systems, rephrasing with alternative generative models, or obscuring text with special characters.

Consequently, OpenAI’s findings suggest that this approach is “easily circumvented by malicious actors.” Additionally, the update reiterated the spokesperson's concern regarding non-English speakers, highlighting that text watermarking might “stigmatize the use of AI as a valuable writing tool for individuals who are non-native English users.”
