Understanding Explainability: Leveraging Clinical Trial Principles for Enhanced AI Safety Testing

The rise of AI in consumer-focused businesses is accompanied by growing concerns about its long-term governance. The urgency for effective AI regulation is underscored by the Biden administration’s recent executive order, which mandates new protocols for the development and deployment of advanced AI systems.

Today, AI providers and regulators emphasize explainability as a core component of AI governance. This focus allows individuals affected by AI systems to understand and challenge the outcomes produced by these technologies, including potential biases.

While explaining simpler algorithms, such as those used for car loan approvals, can be straightforward, newer AI technologies involve complex algorithms that are often difficult to interpret yet provide significant advantages. For example, OpenAI’s GPT-4, with its extensive dataset and billions of parameters, produces human-like conversations that are transforming numerous industries. Similarly, Google DeepMind’s cancer screening models leverage deep learning to deliver accurate disease detection that can save lives.

These intricate models can obscure their own decision-making processes, raising a vital question: should we forgo these beneficial but only partially explainable technologies to avoid uncertainty? Even US lawmakers aiming to regulate AI are recognizing the limits of explainability, which points to the need for an outcome-focused approach to AI governance rather than one centered solely on explanation.

Addressing uncertainties surrounding emerging technologies is not new. The medical community has long understood that identifying potential harms is crucial when developing new therapies. This understanding has led to the creation of randomized controlled trials (RCTs) for assessing risks.

In an RCT, participants are randomly assigned to either a treatment group, which receives the medical intervention, or a control group, which does not. Because random assignment makes the two cohorts comparable, any difference in outcomes can be attributed to the intervention itself, allowing researchers to establish causality and measure a treatment's effectiveness.
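
To make the mechanics concrete, here is a minimal sketch of how outcomes from randomly assigned treatment and control groups might be compared; the participant counts and simulated recovery rates are hypothetical placeholders, not data from any real trial.

```python
# Minimal sketch of an RCT-style comparison; all data here is simulated.
import random
from statistics import mean

random.seed(0)

# Randomly split 200 hypothetical participants into treatment and control arms.
participants = list(range(200))
random.shuffle(participants)
treatment_ids = set(participants[:100])   # receives the intervention
control_ids = set(participants[100:])     # does not

# Simulated binary outcomes (1 = recovered, 0 = not); the rates are made up.
outcomes = {
    pid: int(random.random() < (0.65 if pid in treatment_ids else 0.50))
    for pid in participants
}

treated = [outcomes[pid] for pid in treatment_ids]
control = [outcomes[pid] for pid in control_ids]

# Because assignment was random, the difference in means estimates the causal effect.
estimated_effect = mean(treated) - mean(control)
print(f"Estimated treatment effect: {estimated_effect:.1%}")
```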

Historically, medical researchers have relied on fixed trial designs to evaluate long-term safety and efficacy. In AI, however, systems are continuously learning, and new benefits and risks can emerge with each retraining and deployment. A single, traditional RCT may therefore not be sufficient for assessing AI risk. Alternative frameworks, such as continuous A/B testing, could offer valuable insight into an AI system's outcomes over time.

A/B testing has been widely used in product development for the past 15 years. The method exposes different user groups to different versions of a product in order to measure the impact of specific changes, such as which button design draws the most clicks on a web page. Ronny Kohavi, the former head of experimentation at Bing, pioneered online continuous experimentation, in which users are randomly allocated to either the current version of a site or a new one. This rigorous monitoring allows companies to improve products iteratively while quantifying the benefits of each change against key metrics.
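
As an illustration, the sketch below shows one common way such an allocation can be implemented; the hash-based assignment, experiment name, and click-through metric are illustrative assumptions, not a description of Bing's actual system.

```python
# Minimal sketch of online A/B allocation with a simple click-through metric.
import hashlib
from collections import defaultdict

def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically assign a user to 'control' (current site) or 'treatment' (new version)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

views = defaultdict(int)
clicks = defaultdict(int)

def record_page_view(user_id: str, clicked: bool) -> None:
    variant = assign_variant(user_id, "new_checkout_button")  # hypothetical experiment name
    views[variant] += 1
    clicks[variant] += int(clicked)

# Hypothetical traffic: compare click-through rates across the two variants.
for uid, clicked in [("u1", True), ("u2", False), ("u3", True), ("u4", False)]:
    record_page_view(uid, clicked)

for variant in ("control", "treatment"):
    ctr = clicks[variant] / max(views[variant], 1)
    print(f"{variant}: {views[variant]} views, CTR {ctr:.1%}")
```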

Many tech companies, including Bing, Uber, and Airbnb, have built systems for continually testing technological changes. These frameworks let firms evaluate not only business metrics, like click-through rates and revenue, but also potential harms, such as discrimination.

Effective AI safety measurement might look like this: a large bank is concerned that a new pricing algorithm for personal loans unfairly disadvantages women. Although the model does not use gender as an explicit input, the bank suspects that proxy variables may be influencing outcomes. To test this, the bank could run an experiment in which a treatment group is priced by the new algorithm while a control group receives decisions from the legacy model.

By ensuring that demographics such as gender are evenly distributed across the two groups, the bank can measure any disparate impact and assess the algorithm's fairness. Exposure to the new model can also be controlled through a gradual rollout, so that risk is taken on in measured increments. A rough sketch of that measurement appears below.
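
The sketch assumes hypothetical experiment records of the form (group, gender, offered interest rate); the numbers and field names are illustrative, not the bank's actual data or methodology.

```python
# Minimal sketch: comparing pricing outcomes by gender in each experiment arm.
from statistics import mean

# Hypothetical records: (experiment arm, gender, offered rate in percent).
records = [
    ("treatment", "F", 9.1), ("treatment", "M", 8.7),
    ("treatment", "F", 9.4), ("treatment", "M", 8.9),
    ("control",   "F", 9.0), ("control",   "M", 9.0),
    ("control",   "F", 9.2), ("control",   "M", 9.1),
]

def mean_rate(arm: str, gender: str) -> float:
    return mean(rate for a, g, rate in records if a == arm and g == gender)

for arm in ("control", "treatment"):
    gap = mean_rate(arm, "F") - mean_rate(arm, "M")
    print(f"{arm}: average rate gap (women minus men) = {gap:.2f} pct pts")

# If the gap in the treatment arm is materially larger than in the control arm,
# the new algorithm is introducing a disparity that randomization alone cannot
# explain, and the rollout can be paused or rolled back.
```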

Alternatively, organizations like Microsoft utilize "red teaming," where employees challenge the AI system adversarially to identify its significant risks prior to broader deployment.

Ultimately, measuring AI safety fosters accountability. Unlike explainability, which is inherently subjective, evaluating an AI system's outputs across diverse populations provides a quantifiable framework for assessing potential harms. This process establishes responsibility, enabling AI providers to show that their systems operate effectively and ethically.

While explainability remains a focal point for AI providers and regulators, adopting measurement methodologies from healthcare can help achieve the shared goal of AI systems that are safe, effective, and working as intended.

Caroline O’Brien is the Chief Data Officer and Head of Product at Afiniti, a customer experience AI company. Elazer R. Edelman is the Edward J. Poitras Professor in Medical Engineering and Science at MIT, a Professor of Medicine at Harvard Medical School, and a Senior Attending Physician in the Coronary Care Unit at Brigham and Women’s Hospital in Boston.
