Opt Out: How Your Website Can Avoid Contributing to Google Bard and Future AI Training

Large language models are trained on data from many sources, much of it apparently collected without the knowledge or consent of the people who created it. Now you can prevent Google from using your web content to train Bard and any future models it builds.

To do this, add a rule for the user agent “Google-Extended” to your site’s robots.txt file, the file that tells automated web crawlers which content they may access, and disallow it from your site.
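As a rough illustration of what such a rule could look like, the sketch below embeds a sample robots.txt in a Python string and uses the standard library's `urllib.robotparser` to check how it would be read. The rules shown (blocking the whole site with `Disallow: /` and leaving other crawlers unrestricted) and the `example.com` URL are assumptions made for the example, not Google's recommended configuration, and Python's parser may not match every detail of how Google's own crawlers interpret the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Google-Extended everywhere,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain used only for illustration.
page = "https://example.com/articles/some-post"
google_extended_allowed = is_allowed(ROBOTS_TXT, "Google-Extended", page)
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", page)

print("Google-Extended allowed:", google_extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

In this sketch the Google-Extended rule does not touch the separate Googlebot entry, so under these assumed rules ordinary search indexing would be unaffected while the AI-training token is blocked.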

While Google says it develops its AI ethically, training a model on web content is a very different use of that content from indexing it for search. “We’ve also heard from web publishers that they want greater choice and control over how their content is used for emerging generative AI applications,” notes Danielle Romain, Google’s VP of Trust, as if this revelation were unexpected.

Curiously, the word “train” never appears in her post, even though training machine learning models is plainly what this data is used for. Instead, the VP of Trust asks whether you would really rather not “help improve Bard and Vertex AI generative APIs,” that is, help make these AI models more accurate and capable over time.

This framing suggests that it’s not about Google extracting your content; rather, it’s about whether you’re inclined to assist.

On the one hand, framing the question this way puts consent and informed choice front and center. On the other hand, Bard and its counterparts have already been trained on vast datasets gathered without anyone’s consent, which undercuts the sincerity of that framing.

What Google’s actions reveal is that it took advantage of unrestricted access to online data, acquired what it needed, and is now asking permission after the fact in order to look as though ethical data collection and consent are its priorities. If they really were, this option would have been available much earlier.

In a related move, Medium announced that it would block such crawlers across its entire platform until a better, more granular solution becomes available, a decision echoed by numerous other platforms. Medium also hinted that a coalition of media outlets is forming to block AI crawlers.
