NIST Unveils New Platform for Evaluating Generative AI Technologies

The National Institute of Standards and Technology (NIST), part of the U.S. Commerce Department and tasked with developing and testing technology for government and public use, announced the launch of NIST GenAI on Monday. The new program focuses on evaluating generative AI technologies, including those that generate text and images.

NIST GenAI aims to establish benchmarks, advance “content authenticity” detection systems (such as deepfake detection), and promote software that identifies the origin of misleading AI-generated information, as detailed on the newly launched NIST GenAI website and in a press release.

According to the press release, “The NIST GenAI program will present a series of challenge problems aimed at assessing and measuring the capabilities and limitations of generative AI technologies.” These evaluations will help develop strategies to uphold information integrity and guide the safe and responsible use of digital content.

NIST GenAI’s inaugural project is a pilot study designed to build systems that can reliably distinguish human-generated from AI-generated media, beginning with text. Although various services claim to detect deepfakes, tests have shown their accuracy to be inconsistent, particularly for text. NIST GenAI is calling for submissions from academia, industry, and research labs for two types of systems: “generators” (AI content creation systems) and “discriminators” (systems that identify AI-generated content).

Participants submitting generators must produce 250-word summaries from a given topic and set of documents, while those submitting discriminators will work to determine whether a given summary is likely AI-generated. To keep the evaluation fair, NIST GenAI will supply all the data needed for testing. NIST says it will not accept systems that use publicly available data and do not comply with applicable laws and regulations.

Registration for the pilot phase will open on May 1, with the first of two rounds set to conclude on August 2. Final results from this study are projected to be published in February 2025.

The launch of NIST GenAI and its focus on deepfake detection comes amid a surge in AI-generated misinformation. Data from Clarity, a deepfake detection firm, indicates a 900% increase in deepfakes created and published this year compared with the same period last year. A recent YouGov poll found that 85% of Americans are concerned about the spread of misleading deepfakes online.

NIST GenAI's introduction is part of the agency's response to President Joe Biden's executive order on AI, which mandates enhanced transparency from AI companies regarding their models and sets new standards for labeling AI-generated content.

Additionally, the launch marks the first AI-related initiative from NIST since the appointment of Paul Christiano, a former OpenAI researcher, to its AI Safety Institute.

Christiano's selection has sparked debate due to his controversial “doomerist” views, including predictions that there’s a 50% chance AI development could lead to human extinction. Critics, reportedly including some NIST scientists, worry that Christiano may steer the AI Safety Institute toward focusing on speculative scenarios rather than addressing immediate AI risks.

NIST asserts that NIST GenAI will contribute valuable insights to inform the work of the AI Safety Institute.
