Online Databite | February 20, 2025

Red-Teaming Generative AI Harm

Lama Ahmad
Camille François
Tarleton Gillespie
Briana Vecchione
Moderated by Borhane Blili-Hamelin

“A red-teaming mindset brings curiosity to uncovering how best practices can go wrong.” 

– Borhane Blili-Hamelin

Description

Since the release of ChatGPT in 2022, the viral spread of generative AI technologies has transformed public awareness of the societal impacts of AI. In light of the power and availability of these systems, regulators, technologists, and members of the public have called for new safety practices. Red-teaming — a testing practice adapted from military, cybersecurity, and disinformation mitigation contexts — has emerged as a key solution.

What exactly is generative AI (genAI) red-teaming? What strategies and standards should guide its implementation? And how can it protect the public interest? In this discussion, Lama Ahmad, Camille François, Tarleton Gillespie, Briana Vecchione, and Borhane Blili-Hamelin examine red-teaming’s place in the evolving landscape of genAI evaluation and governance. Their remarks draw on a report by Data & Society and the AI Risk and Vulnerability Alliance (ARVA), a nonprofit that aims to empower communities to recognize, diagnose, and manage harmful flaws in AI. The report, Red-Teaming in the Public Interest, investigates how red-teaming methods are being adapted to confront uncertainty about flaws in systems and to encourage public engagement with the evaluation and oversight of genAI systems. Red-teaming offers a flexible approach to uncovering a wide range of problems with genAI models. It also offers new opportunities for incorporating diverse communities into AI governance practices.

The conversation wrestles with questions of who gets to define “harm,” “flaw,” or “model misbehavior,” and considers what public-facing red-teaming means for public participation in AI safety. Ultimately, this report and discussion present a vision of red-teaming as an area of public-interest sociotechnical experimentation.

 

Speakers

Lama Ahmad 

Lama Ahmad is a technical program manager for trustworthy AI at OpenAI, where she leads efforts on external assessments of AI systems’ impacts on society. Her work as part of the safety systems team includes leading the researcher access program, which enables the study of key areas related to the responsible deployment of AI and the mitigation of risks associated with such systems, as well as OpenAI’s red-teaming efforts, third-party assessment and auditing, and public input projects.

 

Camille François 

Camille François is an associate professor of practice in international and public affairs at Columbia University. She specializes in how organized actors use digital technologies to harm society and individuals. She has advised governments and parliamentary committees on both sides of the Atlantic — from investigating Russian interference in the 2016 US presidential election on behalf of the US Senate Select Intelligence Committee, to leading the French government’s recent inquiry into the economic opportunities and social challenges presented by the metaverse. She currently serves as the senior director for trust and safety at Niantic, a pioneering augmented reality and gaming company.

 

Tarleton Gillespie

Tarleton Gillespie is a senior principal researcher at Microsoft Research, an affiliated associate professor in the Department of Communication and Department of Information Science at Cornell University, author of Wired Shut: Copyright and the Shape of Digital Culture (MIT, 2007), co-editor of Media Technologies: Essays on Communication, Materiality, and Society (MIT, 2014), and author of Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media (Yale, 2018).

 

Briana Vecchione | Google Scholar

Briana Vecchione is a technical researcher with Data & Society’s Algorithmic Impact Methods Lab (AIMLab) whose work focuses on auditing and accountability in algorithmic systems, particularly at the intersection of social justice, participatory design, and policy. She holds a Ph.D. from the Department of Computing and Information Science at Cornell University. Her work has been supported by a Facebook Fellowship, and grants from the MacArthur Foundation, Mozilla Foundation, and Notre Dame-IBM Tech Ethics Lab.

Moderator

Borhane Blili-Hamelin | LinkedIn

Borhane Blili-Hamelin, a Data & Society affiliate, works on improving AI governance through research, auditing, and policy. His research takes a practical, qualitative, and philosophical lens to technical areas like machine learning evaluation, auditing, red-teaming, risk assessments, and incident and vulnerability reporting. He co-leads the AI Risk and Vulnerability Alliance — the nonprofit home of the AI Vulnerability Database (AVID) — and is a senior consultant at BABL AI, where he conducts audits for compliance with regulations and standards across the US and Europe. His work has been supported by grants from the Brown Institute for Media Innovation and the Omidyar Network, and published at FAccT, AIES, ICML, and IEEE-USA. He earned his PhD in philosophy from Columbia University.

Credits

Production: CJ Brody Landow

Co-Production: Tunika Onnekikami

Co-Curation: Ranjit Singh, Jacob Metcalf, Borhane Blili-Hamelin

Web: Alessa Erawan

Editorial: Eryn Loeb

Video Editor: Tunika Onnekikami

Design: Surbhi Chawla

Transcript Editor: Sarika Ram