Meta to Utilize Public User Data for AI Training, Offers EU Users Opt-Out Option

Meta has made headlines by announcing its plan to use public user data and posts to train its AI models, a move that currently allows only EU users to opt out. The company intends to draw on publicly available content from its major platforms, including Facebook and Instagram, to refine its foundation models. Importantly, private messages and content from users under the age of 18 will be excluded.

Meta argues that training on public content is necessary because, without this data, its AI would struggle to accurately comprehend significant regional languages, cultural nuances, and popular social media trends. While no EU regulation explicitly prohibits the use of user data for AI training, the General Data Protection Regulation (GDPR) requires companies to secure explicit consent before processing personal user data. Businesses must also be transparent about how they use that data and must allow users to withdraw consent at any time.

Meta cautioned that EU users who choose to opt out might miss out on AI models informed by the rich cultural, social, and historical contributions unique to Europe. This position places Meta in a competitive landscape alongside other major AI developers such as OpenAI and Google, although Meta asserts that its approach is more transparent than that of its counterparts. The company says it has sent billions of notifications and emails informing European users of their opt-out options.

Stefano Fratta, Meta’s global engagement director for privacy policy, emphasized in a blog post the company’s commitment to transparency. He stated, “Our approach is more transparent and offers easier controls than many of our industry counterparts already training their models on similar publicly available information.”

It is worth noting that the opt-out policy does not extend to Llama 3, as that model's training had concluded before this announcement; it will, however, apply to Meta's upcoming models. Fratta clarified that while models may be trained on publicly shared posts, the intention is not to build a database of personal user information or to identify individuals. Instead, the models are meant to learn patterns such as colloquial expressions and local references.

The announcement of greater transparency comes amid scrutiny from the nonprofit Noyb, which has lodged 11 complaints over Meta's AI-related data handling practices. Noyb is urging several EU member states to expedite procedures available under the GDPR to halt Meta's use of user data ahead of a critical deadline.

Noyb contends that Meta's interpretation of data usage amounts to the company asserting a right to use "any data from any source for any purpose and make it available to anyone in the world," provided it is framed as related to "AI technology." Max Schrems, chairman and founder of Noyb, criticized this stance, arguing that the term "AI technology" is vague and lacks specific legal limits. He emphasized the risks involved, noting that Meta has not clarified what the data will be used for, which could range from simple chatbots to highly targeted advertising or more serious applications. Meta's statements about making user data accessible to third parties raise further concerns about privacy and data security.
