A recent report from the World Privacy Forum has identified serious shortcomings in AI governance tools used by governments and multilateral organizations. Its review of 18 such tools found that more than a third (38%) contained "faulty fixes." These tools, designed to evaluate AI systems for qualities such as fairness and explainability, often lacked basic quality assurance mechanisms or relied on measurement methods deemed unsuitable outside their original contexts.
Many of these tools were developed by major technology companies such as Microsoft, IBM, and Google, the same firms that build the AI systems the tools are meant to assess. The report highlights IBM's AI Fairness 360 toolkit, which the US Government Accountability Office has cited as an example of incorporating ethical principles into AI. Yet the foundational research behind its Disparate Impact Remover algorithm has faced strong criticism in academic circles.
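The report does not unpack the algorithm itself, but the research it rests on, Feldman et al.'s 2015 work on certifying and removing disparate impact, proposes a rank-preserving "repair" of input features so their distributions look similar across demographic groups. The sketch below is a deliberately simplified illustration of that idea, not IBM's implementation: the function name, the pooled target distribution (the paper uses a per-quantile median across groups), and the 101-point quantile grid are all assumptions made for brevity.

```python
import numpy as np

def repair(values, groups, repair_level=1.0):
    """Rank-preserving 'repair' of one numeric feature.

    Each value is moved toward the value holding the same within-group
    rank in a shared target distribution, so the feature's distribution
    looks similar across groups while within-group ordering is kept.
    repair_level=1.0 means full repair; 0.0 leaves the data untouched.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    repaired = values.copy()

    # Shared target: quantiles of the pooled data (a simplification;
    # the original paper uses a per-quantile median across groups).
    probs = np.linspace(0.0, 1.0, 101)
    target = np.quantile(values, probs)

    for g in np.unique(groups):
        mask = groups == g
        group_vals = values[mask]
        # Within-group quantile (rank) of each value.
        ranks = np.searchsorted(np.sort(group_vals), group_vals,
                                side="right") / len(group_vals)
        fully_repaired = np.interp(ranks, probs, target)
        repaired[mask] = ((1 - repair_level) * group_vals
                          + repair_level * fully_repaired)
    return repaired

# Hypothetical use: test scores that skew lower for group "b".
scores = [55, 60, 70, 80, 85, 90, 40, 45, 50, 60, 65, 75]
labels = ["a"] * 6 + ["b"] * 6
print(repair(scores, labels))
```

The academic criticism the report cites targets exactly this kind of move: altering feature values can certify a statistical property without addressing, and sometimes while obscuring, the underlying inequity.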
Pam Dixon, founder and executive director of the World Privacy Forum, stated, "Most AI governance tools currently in use are operating without proper standards." She noted that a major problem is the absence of established quality assurance requirements: a tool built for one specific context can be misapplied elsewhere, resulting in "off-label" uses with unforeseen consequences.
The report defines AI governance tools as mechanisms to assess and measure aspects of AI systems including inclusivity, fairness, explainability, privacy, and safety. While these tools may offer reassurance to regulators and the public, they can inadvertently foster a false sense of security, trigger unintended issues, and undermine the potential of AI technologies.
With the recent passage of the EU AI Act and President Biden’s AI Executive Order, there exists a timely opportunity to enhance the AI governance landscape, as noted by Kate Kaye, deputy director of the World Privacy Forum. "Although we've identified some flaws, there's significant potential for improvement in the AI governance ecosystem," she remarked. "These tools represent how governments are enacting AI policies and will play crucial roles in implementing future laws and regulations."
Kaye also shared an example of how good intentions can lead to poor outcomes: the four-fifths (80%) rule from US employment law is being misapplied in AI governance tools around the world. The rule checks whether a selection process adversely impacts a specific group by comparing that group's selection rate with the rate of the most-favored group; a ratio below 80% is treated as evidence of adverse impact. Yet the rule has been abstracted into contexts that have nothing to do with employment, leading to inappropriate applications, as the sketch below illustrates.
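To make the arithmetic concrete, here is a minimal sketch in Python of the four-fifths calculation as fairness toolkits commonly encode it; the function name and the example figures are illustrative assumptions, not drawn from any particular tool.

```python
def four_fifths_ratio(selected_protected, total_protected,
                      selected_reference, total_reference):
    """Adverse-impact ratio behind the four-fifths rule.

    Compares the selection rate of a protected group with that of the
    most-favored (reference) group; under US employment guidelines a
    ratio below 0.8 is treated as evidence of adverse impact.
    """
    protected_rate = selected_protected / total_protected
    reference_rate = selected_reference / total_reference
    return protected_rate / reference_rate

# Hypothetical figures: 30 of 100 protected-group applicants selected
# versus 50 of 100 reference-group applicants.
ratio = four_fifths_ratio(30, 100, 50, 100)
print(f"Adverse-impact ratio: {ratio:.2f}")  # 0.60
print("potential adverse impact" if ratio < 0.8 else "passes the 80% threshold")
```

The arithmetic is trivial, which is part of the danger: the 0.8 threshold was calibrated for US hiring decisions, so transplanting the same test into unrelated domains is precisely the kind of "off-label" use the report warns about.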
Amid pressure to establish AI regulations, Kaye warned against embedding flawed methodologies into policy. She emphasized the risk of perpetuating existing problems through rushed implementations.
Looking ahead to 2024, both Dixon and Kaye are optimistic about advancements in AI governance tools. Dixon noted that the Organisation for Economic Co-operation and Development (OECD) is eager to collaborate on improving these tools, a positive signal. The National Institute of Standards and Technology (NIST) has also expressed interest in developing rigorous, evidence-based standards. With focused effort, they believe meaningful improvements in the AI governance landscape can be achieved within six months.