OpenAI Develops Tool for Content Creators to ‘Opt Out’ of AI Training Process

OpenAI is developing a new tool designed to give creators greater control over how their content is used to train generative AI models. Named Media Manager, the tool will let creators and content owners identify their works within OpenAI's systems and specify whether those materials may be included in AI research and training.

OpenAI aims to roll out Media Manager by 2025 as part of its collaborative efforts with “creators, content owners, and regulators” to establish standards—potentially through the industry steering committee it has recently joined. In a blog post, OpenAI explained, “This requires cutting-edge machine learning research to create a unique tool that will help us identify copyrighted text, images, audio, and video across various sources while reflecting the preferences of creators. Over time, we plan to introduce additional choices and features.”

The introduction of Media Manager appears to be OpenAI's response to growing scrutiny of its data practices, which have relied heavily on scraping publicly available data from the internet. Recently, eight major U.S. newspapers, including the Chicago Tribune, filed a copyright lawsuit against OpenAI, alleging that the company improperly used their articles to train generative AI models, which it then commercialized without compensation or credit.

Generative AI models, including those developed by OpenAI, rely on vast datasets sourced primarily from public websites. Proponents argue that fair use, the legal doctrine permitting the use of copyrighted works to create transformative secondary works, protects scraping those materials for model training. This interpretation remains contentious, however, and critics assert that the use of copyrighted material in training should be more rigorously regulated.

In a bid to address these concerns and mitigate potential legal challenges, OpenAI has taken steps to find common ground with content creators. Last year, the organization allowed artists to “opt out” of having their work used in the training datasets for image-generating models by submitting individual images for removal. Additionally, website owners can now specify through the robots.txt protocol whether content on their sites can be scraped for AI training purposes. OpenAI is also actively pursuing licensing agreements with major content providers, including news organizations, stock media libraries, and Q&A platforms like Stack Overflow.
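For site owners, the opt-out works through standard robots.txt rules. As a minimal sketch, assuming a site wants to block training crawls entirely, adding the following to the robots.txt file at the site root disallows OpenAI's documented training crawler, GPTBot, from every path:

    # Block OpenAI's training crawler from all paths on this site
    User-agent: GPTBot
    Disallow: /

Note that robots.txt is advisory rather than enforceable: it only restrains crawlers that choose to honor the protocol, as OpenAI says GPTBot does.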

Despite these efforts, some content creators feel that OpenAI has not gone far enough. Artists have criticized the image opt-out process, which requires submitting each image individually along with a description. Reports indicate that OpenAI pays relatively modest fees for licensed content. And, as OpenAI itself acknowledges, its current measures do not address situations where creators' works are quoted, remixed, or reposted on platforms the creators do not control.

In addition to OpenAI's initiatives, several third-party companies are working to create universal tools for provenance and opt-out options in generative AI. The startup Spawning AI, which collaborates with Stability AI and Hugging Face, provides an app that tracks and blocks scraping attempts by identifying bots' IP addresses; Spawning also maintains a registry where artists can list their works so that vendors who honor it exclude those works from training. Other companies, such as Steg.AI and Imatag, help creators establish ownership of their images through imperceptible watermarks, while the University of Chicago's Nightshade project "poisons" image data to disrupt its utility in AI training.
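The vendors' exact watermarking methods are proprietary, but the general idea can be illustrated with a toy sketch: hide an ownership message in pixel bits that are visually imperceptible. The Python below uses simple least-significant-bit steganography; it is a minimal illustration under that assumption, not the techniques Steg.AI or Imatag actually use, and the function names are hypothetical.

    # Toy illustration of an imperceptible watermark via least-significant-bit
    # (LSB) steganography. NOT the proprietary methods of Steg.AI or Imatag;
    # real systems are far more robust to compression, resizing, and cropping.
    from PIL import Image
    import numpy as np

    def embed_watermark(image_path: str, message: str, out_path: str) -> None:
        """Hide an ASCII message in the red channel's least significant bits."""
        img = np.array(Image.open(image_path).convert("RGB"))
        bits = [int(b) for byte in message.encode("ascii") for b in f"{byte:08b}"]
        red = img[..., 0].flatten()  # flatten() returns a copy we can edit
        if len(bits) > red.size:
            raise ValueError("message too long for this image")
        red[: len(bits)] = (red[: len(bits)] & 0xFE) | bits  # overwrite LSBs
        img[..., 0] = red.reshape(img.shape[:2])
        Image.fromarray(img).save(out_path, format="PNG")  # lossless format

    def extract_watermark(image_path: str, length: int) -> str:
        """Read back `length` ASCII characters from the red channel's LSBs."""
        img = np.array(Image.open(image_path).convert("RGB"))
        bits = img[..., 0].flatten()[: length * 8] & 1
        chars = [
            int("".join(str(b) for b in bits[i : i + 8]), 2)
            for i in range(0, len(bits), 8)
        ]
        return bytes(chars).decode("ascii")

    # Usage:
    #   embed_watermark("art.png", "(c) Jane Doe 2024", "art_marked.png")
    #   extract_watermark("art_marked.png", len("(c) Jane Doe 2024"))

Production watermarking systems typically embed the mark redundantly in frequency-domain features so it survives compression and cropping, which plain LSB embedding does not.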

Overall, these efforts illustrate the ongoing dialogue regarding the ethical use of data in AI training and the need for greater transparency and protection for creators’ rights in the evolving landscape of generative AI.
