“A red-teaming mindset brings curiosity to uncovering how best practices can go wrong.”
– Borhane Blili-Hamelin
Description
Since the release of ChatGPT in 2022, the viral spread of generative AI technologies has transformed public awareness of the societal impacts of AI. In light of the power and availability of these systems, regulators, technologists, and members of the public have called for new safety practices. Red-teaming — a testing practice adapted from military, cybersecurity, and disinformation-mitigation contexts — has emerged as a key solution.
What exactly is generative AI (genAI) red-teaming? What strategies and standards should guide its implementation? And how can it protect the public interest? In this discussion, Lama Ahmad, Camille François, Tarleton Gillespie, Briana Vecchione, and Borhane Blili-Hamelin examine red-teaming’s place in the evolving landscape of genAI evaluation and governance. Their remarks draw on a report by Data & Society and the AI Risk and Vulnerability Alliance (ARVA), a nonprofit that aims to empower communities to recognize, diagnose, and manage harmful flaws in AI. The report, Red-Teaming in the Public Interest, investigates how red-teaming methods are being adapted to confront uncertainty about flaws in genAI systems and to encourage public engagement with their evaluation and oversight. Red-teaming offers a flexible approach to uncovering a wide range of problems with genAI models. It also offers new opportunities for incorporating diverse communities into AI governance practices.
The conversation wrestles with questions of who gets to define “harm,” “flaw,” or “model misbehavior,” and considers what public-facing red-teaming means for public participation in AI safety. Ultimately, this report and discussion present a vision of red-teaming as an area of public-interest sociotechnical experimentation.
Speakers
Lama Ahmad is a technical program manager for trustworthy AI at OpenAI, where she leads efforts on external assessments of AI systems’ impacts on society. As part of the safety systems team, she leads the researcher access program, which enables the study of key areas related to the responsible deployment of AI and the mitigation of its risks, as well as OpenAI’s red-teaming efforts, third-party assessment and auditing, and public input projects.
Camille François is associate professor of practice of international and public affairs at Columbia University. She specializes in how organized actors use digital technologies to harm society and individuals. She has advised governments and parliamentary committees on both sides of the Atlantic — from investigating Russian interference in the 2016 US presidential election on behalf of the US Senate Select Intelligence Committee, to leading the French government’s recent inquiry into the economic opportunities and social challenges presented by the metaverse. She currently serves as the senior director for trust and safety at Niantic, a pioneering augmented reality and gaming company.
Tarleton Gillespie is a senior principal researcher at Microsoft Research, an affiliated associate professor in the Department of Communication and Department of Information Science at Cornell University, author of Wired Shut: Copyright and the Shape of Digital Culture (MIT, 2007), co-editor of Media Technologies: Essays on Communication, Materiality, and Society (MIT, 2014), and author of Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media (Yale, 2018).
Briana Vecchione is a technical researcher with Data & Society’s Algorithmic Impact Methods Lab (AIMLab) whose work focuses on auditing and accountability in algorithmic systems, particularly at the intersection of social justice, participatory design, and policy. She holds a Ph.D. from the Department of Computing and Information Science at Cornell University. Her work has been supported by a Facebook Fellowship, and grants from the MacArthur Foundation, Mozilla Foundation, and Notre Dame-IBM Tech Ethics Lab.
Moderator
Borhane Blili-Hamelin, a Data & Society affiliate, works on improving AI governance through research, auditing, and policy. His research brings a practical, qualitative, and philosophical lens to technical areas like machine learning evaluation, auditing, red-teaming, risk assessments, and incident and vulnerability reporting. He co-leads the AI Risk and Vulnerability Alliance — the nonprofit home of the AI Vulnerability Database (AVID) — and is a senior consultant at BABL AI, where he conducts audits for compliance with regulations and standards across the US and Europe. His work has been supported by grants from the Brown Institute for Media Innovation and the Omidyar Network, and published at FAccT, AIES, ICML, and IEEE-USA. He earned his PhD in philosophy from Columbia University.
Resources
- “Red-Teaming for Generative AI: Silver Bullet or Security Theater?” | Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society
- “AI Red-Teaming is a Sociotechnical System. Now What?” | arXiv:2412.09751
- “Under the Radar? Examining the Evaluation of Foundation Models” | Ada Lovelace Institute
- “What’s the Worst AI Can Do? This Team is Finding Out” | The Journal podcast
- “Can We Red Team Our Way to AI Accountability?” | TechPolicy Press
- Red Teaming AI Systems | OpenAI virtual event
- “OpenAI’s Approach to External Red Teaming for AI Models and Systems” | Ahmad et al. 2024
References
- Advancing Red-teaming with People and AI | OpenAI
- Bug Bounties For Algorithmic Harms? | Algorithmic Justice League
- Red Team Wisdom From Experts | Council on Foreign Relations
- Arvind Narayanan | LinkedIn
Credit
Production: CJ Brody Landow
Co-Production: Tunika Onnekikami
Co-Curation: Ranjit Singh, Jacob Metcalf, Borhane Blili-Hamelin
Web: Alessa Erawan
Editorial: Eryn Loeb
Video Editor: Tunika Onnekikami
Design: Surbhi Chawla
Transcript Editor: Sarika Ram