Meta Halts AI Training with European User Data Amid Regulatory Pressure

Meta has announced a pause in its plans to train artificial intelligence systems using data from users in the European Union and the U.K. The decision follows significant pushback from the Irish Data Protection Commission (DPC), Meta’s lead regulator in the EU, which acts on behalf of several data protection authorities across the bloc. The U.K.’s Information Commissioner’s Office (ICO) has also called for a halt to Meta’s plans until its concerns are addressed.

“The DPC welcomes Meta's decision to pause its plans to train its large language model with public content shared by adults on Facebook and Instagram across the EU/EEA,” stated the DPC on Friday. “This decision follows extensive discussions between the DPC and Meta. The DPC, in collaboration with its fellow EU data protection authorities, will continue to engage with Meta regarding this matter.”

While Meta already uses user-generated content to train its AI in markets such as the U.S., the strict requirements of Europe’s General Data Protection Regulation (GDPR) have created hurdles for Meta and other companies striving to enhance their AI systems. Last month, Meta began notifying users of a forthcoming change to its privacy policy that would grant it the right to use public content from Facebook and Instagram to train its AI models, including content from comments, interactions with businesses, status updates, and photos with their associated captions. Meta argued that these changes would help its models reflect the “diverse languages, geography, and cultural references of the people in Europe.”

This policy update was scheduled to take effect on June 26, just 12 days away, prompting the privacy activist organization NOYB (“None of Your Business”) to file 11 complaints with authorities across the EU, claiming that Meta’s actions violate several provisions of the GDPR. One major concern involves opt-in versus opt-out: NOYB argues that users should be asked for consent before their personal data is processed, rather than being required to take action to refuse.

In its defense, Meta cited a GDPR provision known as “legitimate interests,” arguing that its data processing complies with the regulation. This isn’t the first time Meta has leaned on this legal basis: it previously invoked legitimate interest to justify targeted advertising for European users. However, the Court of Justice of the European Union (CJEU) ruled that legitimate interests could not justify that practice, which suggests Meta may face similar challenges in this instance as well.

Regulators were widely expected to at least pause Meta’s changes, especially given how difficult the company made it for users to opt out of having their data used. Meta said it had sent more than 2 billion notifications about the policy changes. But unlike prominent public-interest messages, such as prompts to go out and vote, these notifications appeared among routine alerts about friends’ birthdays and photo tags, making them easy to miss for users who do not frequently check their notifications.

For those who did notice the notification, it was not obvious how to object or opt out. The notification simply invited users to click through to learn how Meta would use their information, with no clear indication that there was an option to refuse at all.

Technically, users could not simply opt out of having their data used. Instead, they had to fill out an objection form explaining why they did not want their data processed. Meta said it would review each request, but the company retained full discretion over the final decision.

While the objection form was accessible from the notification, finding it through account settings required a series of steps that were not intuitive.

To file an objection on Facebook’s website, users had to navigate several layers: clicking their profile photo at the top right, selecting settings, tapping the privacy center, scrolling down to the Generative AI at Meta section, and finally navigating past a series of links to find the discreet “right to object” form buried within roughly 1,100 words of text. The process was similarly convoluted in the Facebook mobile app.

When asked why the company required users to file objections rather than allowing for simpler opt-in options, Meta’s policy communications manager, Matt Pollard, pointed to an existing blog post stating: “We believe this legal basis [‘legitimate interest’] is the most appropriate balance for processing public data on the scale necessary to train AI models while respecting people’s rights.” Essentially, Meta suggested that an opt-in framework would not yield sufficient data “scale,” leading to the decision to issue a single notification buried among others and requiring users to navigate multiple clicks to object.

In a recent blog update, Meta’s global engagement director for privacy policy, Stefano Fratta, expressed disappointment over the DPC’s request. “This is a setback for European innovation and AI development, delaying the benefits AI can bring to people in Europe,” Fratta stated, adding that Meta remains confident its approach complies with European laws and regulations.

The ongoing scrutiny of Meta reflects a broader pattern in the tech industry’s AI arms race, which has cast a spotlight on the vast troves of data held by major tech companies. Earlier this year, Reddit disclosed plans to earn more than $200 million by licensing its data to companies such as OpenAI and Google; Google itself was recently fined heavily for using copyrighted news content to train its generative AI models.

These developments underscore the lengths to which companies will go to leverage user data within the bounds of existing legal frameworks; efforts to make opting out simple are conspicuously rare. A notable example was Slack’s privacy policy, which indicated that users could opt out of AI training only by emailing the company.

Google recently gave online publishers a way to opt their sites out of being used to train its AI models by inserting a specific piece of code. OpenAI, for its part, is building a tool that will let content creators opt out of training its generative AI, scheduled for release in 2025.
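
The article does not name the mechanism, but Google’s published control for this is the Google-Extended token, which publishers place in their site’s robots.txt file. A minimal sketch, assuming a publisher wants to withhold an entire site from AI training use:

    # robots.txt: asks crawlers honoring the Google-Extended token to skip the whole site.
    # Ordinary Googlebot search indexing is governed separately and is unaffected.
    User-agent: Google-Extended
    Disallow: /

Google has said this token only controls whether content is used to improve its generative AI models; it does not change how pages are crawled or ranked in Search.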

Although Meta's current efforts to use public content for AI training in Europe are paused, they may resurface in a revised form following further discussions with the DPC and ICO. Stephen Almond, the ICO’s executive director for regulatory risk, stated, “For generative AI to reach its full potential, it is essential that the public can trust that their privacy rights will be upheld from the very beginning.” He affirmed that the ICO would continue to monitor significant generative AI developers, including Meta, to ensure that the rights of U.K. users are protected.
