Fairly Trained Launches Certification for Ethical AI: Ensuring Generative Tools Use Licensed Data

It is often referred to as the “original sin” of generative AI: many leading models from companies such as OpenAI and Meta have been trained on data scraped from the internet without the knowledge or consent of the original creators.

AI companies defending this practice argue it's legally permissible. OpenAI states in a recent blog post, “Training AI models using publicly available internet materials is fair use, supported by long-standing precedents. We believe this principle is fair to creators, necessary for innovators, and critical for US competitiveness.”

Data scraping itself predates the rise of generative AI: it has long been used to build research databases and commercial products, including popular search engines like Google, which creators rely on for traffic to their work.

However, opposition is mounting against this practice, with numerous authors and artists suing several AI companies for allegedly infringing copyright by training on their work without explicit consent. Notably, Midjourney and OpenAI are among the companies facing scrutiny.

A new nonprofit organization, “Fairly Trained,” has emerged to advocate for data creators, insisting that explicit consent should be obtained before their work is used in AI training. Co-founded by former Stability AI employee Ed Newton-Rex, Fairly Trained aims to ensure that AI companies respect the rights of creators.

“We believe many consumers and companies would prefer to collaborate with generative AI companies that train on data provided with the consent of its creators,” states the organization’s website.

Newton-Rex emphasizes a path forward for generative AI that honors creators and advocates for a licensing model for training data. "If you work at or know a generative AI company that prioritizes this approach, I hope you’ll consider getting certified,” he shared on social media.

When asked about the common argument from AI proponents that training on publicly available data is similar to how humans learn from observing creative work, Newton-Rex countered:

“This argument is flawed for two reasons. First, AI scales. A single AI can generate vast amounts of output that could replace demand for much of the original content—something no individual human can do. Second, human learning operates within an established social contract; creators have always known that their work could inspire others. They didn’t anticipate AI systems leveraging their creations to generate competing content at scale.”

Newton-Rex advises AI companies that have already trained on publicly available data to transition to a licensing model, obtaining permission from creators. “We are still early in the evolution of generative AI, and there is time to create a mutually beneficial ecosystem for human creators and AI companies,” he noted.

Fairly Trained has introduced a “Licensed Model (L) certification for AI providers” to distinguish between companies that obtain consent for training data and those that do not. The certification process involves an online submission followed by a more in-depth review, with fees based on annual revenue, ranging from $150 to $6,000.

Newton-Rex explained, “We charge fees to cover our costs, and they are low enough not to be prohibitive for generative AI companies.” Several companies, including Beatoven.AI and Soundful, have already received the certification, though Newton-Rex declined to say what fees individual companies had paid.

When queried about companies like Adobe and Shutterstock that train AI models using creator works under their terms of service, he stated, “We prefer not to comment on specific models we haven’t certified. If they believe their models meet our certification standards, they are encouraged to apply.”

Advisers to Fairly Trained include Tom Gruber, former chief technologist of Siri, and Maria Pallante, President & CEO of the Association of American Publishers. Supporters include notable organizations like the Association of Independent Music Publishers and Universal Music Group, both of which are involved in lawsuits against AI company Anthropic over copyrighted song lyrics.

When asked whether Fairly Trained was participating in any ongoing lawsuits, Newton-Rex clarified, “No, I’m not involved in any of the lawsuits.” He also confirmed that there are currently no external funding sources for Fairly Trained aside from certification fees.
