YouTube CEO Warns OpenAI: Training AI Models on YouTube Videos Would Violate Platform Terms

In a recent media interview, YouTube CEO Neal Mohan addressed concerns surrounding the training data used by OpenAI's video generation AI model, Sora. He stated that while there is no direct evidence that OpenAI has used YouTube videos for training their models, doing so would violate YouTube's terms of service.

Mohan emphasized that content creators retain specific rights when they upload videos, including reasonable use and protection of their content. According to YouTube's service agreement, downloading and using video segments for AI training without authorization is explicitly prohibited, as such actions undermine the trust between creators and the platform.

While Mohan expressed concerns about OpenAI, he acknowledged that YouTube's parent company, Google, had used YouTube content to train its own AI model, Gemini. He clarified that Google secured permission from creators and adhered to the relevant contracts before doing so. The contrast raises the question of whether OpenAI followed a comparable authorization process for its data.

OpenAI has been ambiguous about the sources of the training data for Sora. Mira Murati, the company's Chief Technology Officer, did not confirm whether YouTube videos were used. She indicated only that publicly available videos that can legally be used might be part of the training set, and did not state this with certainty.

This situation has sparked widespread debate over the compliance of data usage in AI model training. As AI technologies evolve, ensuring data legality, respecting creator rights, and adhering to regulatory and industry standards have become focal points within the industry.

Because YouTube is one of the world's largest video platforms, its stance on content copyright and data usage is significant. Mohan's statements send a clear message: unauthorized use of YouTube videos for AI model training will carry strict repercussions.

Consequently, OpenAI must carefully consider compliance issues related to training data. This serves as a reminder to other AI companies and research institutions to respect data copyrights and privacy when utilizing public data for model training, ensuring sustainable and responsible AI development.
