Policy and Accountability

Measuring Justice: Field Notes on Algorithmic Impact Assessments

A look at some of the problems and questions framing our research

January 10, 2024

Algorithmic impact assessments (AIAs) are at an inflection point: amid increasing recognition of AIAs as an important tool for accountability, recent or pending legislation (including the Algorithmic Accountability Act of 2022, California’s AB 331, and the EU AI Act) could make AIAs mandatory in the EU, Brazil, Canada, and the United States. AIAs build on other kinds of impact assessments, such as human rights, privacy, and environmental impact assessments. For now, however, AIAs are not yet legally mandated, and there are few public examples of assessments that have actually been conducted. Data & Society’s AIMLab is working to counter this opacity, experimenting with what it looks like to assess how algorithms impact people’s lives, choices, and access to essential services. For our team of researchers, all of whom are deeply attuned to the ethics of the research process, it’s both exhilarating and overwhelming to be part of a field of inquiry that is still very much in formation.

Having worked on sustainability product teams in the tech industry and for green-tech nonprofits, I am particularly focused on the problem of measuring technologies’ environmental and social impacts as the climate crisis becomes undeniable and resource-intensive generative AI takes off. In 2024, because of EU regulation (the Corporate Sustainability Reporting Directive), many large companies will have to begin reporting their greenhouse gas emissions, and the success of those efforts will depend on measurement tools, industry-wide standards, and internal managerial support within organizations. Some of the required reporting will be complicated by external factors: one category of emissions, Scope 3 (the indirect emissions that occur across a company’s value chain), is notoriously difficult to measure because of supply chain complexities, as the sketch after this paragraph illustrates. The same kinds of complexities will affect efforts to use AIAs to assess technology’s downstream impacts on different communities and ecosystems around the world. In fact, it is not yet clear how algorithmic impact assessments will work in practice or how they can best capture the sociotechnical aspects of AI production and deployment.
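To illustrate one source of that difficulty: when supplier-specific data is unavailable, companies often fall back on spend-based Scope 3 estimates, multiplying procurement spend by industry-average emission factors. The sketch below uses made-up categories, factors, and spend figures; in practice the factors come from economic input-output databases and vary by sector, region, and year, which is part of why Scope 3 totals are so hard to audit or compare.

```python
# Illustrative spend-based Scope 3 estimate, a common fallback when supplier-
# specific emissions data is unavailable. All factors and spend figures below
# are hypothetical; real emission factors come from economic input-output
# databases and carry large uncertainties.

# kgCO2eq per dollar spent, by procurement category (hypothetical values)
EMISSION_FACTORS = {
    "cloud_compute": 0.05,
    "hardware": 0.45,
    "logistics": 0.30,
}

annual_spend_usd = {  # hypothetical annual procurement spend
    "cloud_compute": 2_000_000,
    "hardware": 500_000,
    "logistics": 150_000,
}

scope3_kg = sum(
    spend * EMISSION_FACTORS[category]
    for category, spend in annual_spend_usd.items()
)
print(f"Estimated Scope 3 emissions: {scope3_kg / 1000:.1f} tCO2eq")
# Swapping in a different factor database can shift this total substantially,
# which is one reason Scope 3 figures are difficult to compare across companies.
```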

As the AIMLab team lays the foundation for this work, seeking to understand the gaps between policy and implementation, and between ethical frameworks and actual practices, our questions are at once theoretical and methodological. What is an impact, and how do we go about measuring it? And if there are measurable harms attached to a particular technology, how do we then go about mitigating them? These questions require us to confront a fundamental disconnect between how developers and even policymakers may want to address the problem of algorithmic assessment and the more complicated realities for impacted communities on the ground. For companies and policymakers alike, having a simple rubric and tools for measuring impacts is of central importance. And, practically speaking, it is crucial for the impact assessment process to be relatively efficient and widely applicable across technologies and across social and cultural contexts. But as ethnographers of AI in the majority world and of DEI in AI ethics spaces, practiced auditors, STS-informed AI accountability scholars, and participatory methods practitioners, we at AIMLab are interested in balancing the desire to measure, categorize, and report potential harms in a streamlined fashion with the messy realities of lived experience. We know that when people and ecologies meet technical systems, there will always be unanticipated consequences. So how do we identify, document, and push back against these sometimes ambient harms? Among other things, our process involves careful relationship building and multi-disciplinary translation, as we attempt to reconcile the internal metrics of companies and organizations with a justice-oriented sociotechnical analysis.

Reflecting on the history of environmental impact assessments (EIAs) offers a glimpse of how, as researchers and advocates, we can push for robust, holistic forms of accountability. Under the National Environmental Policy Act (NEPA) of 1969, researchers began work in 1970, soon after the Trans-Alaska Pipeline was proposed but before it was permitted and construction began, on an environmental impact statement (EIS), a more intensive version of an environmental impact assessment. The EIS was meant to weigh the pipeline’s effects on wildlife, ecosystems, and human cultures against the supposed benefits of rapid development, and to identify methods that would reduce environmental damage. As the environmental studies scholar Rabel J. Burdge explains in a 1991 article published in the journal Impact Assessment, Inuit leaders were concerned about how entire ways of life might be altered by a massive construction project, which also brought an influx of construction workers to the area. In that case, the EIS considered the pipeline’s ecological effects on people and wildlife, but did not fully examine the downstream social impacts on Inuit life. Other critics at the time lamented that published environmental impact statements advocated for forthcoming construction projects rather than fully assessing their potential harms. The Department of the Interior also pushed for the project to go ahead because of the nation’s need for oil. When all was said and done, the impact statement was used to justify the pipeline’s construction. Given this history of capture, AI researcher Theodora Dryer has called for a more justice-oriented framework for assessing technology’s climate and social impacts.

At AIMLab, one of our biggest concerns is avoiding even the perception that we might be rubber-stamping potentially harmful systems. At the same time, we need access to systems across a number of sectors in order to put together a methodological toolkit for conducting AIAs across a range of case studies. How do AIAs work in small startups or civil society contexts as opposed to enterprise-scale companies? To find out, we’ll need to partner with researchers in industry settings and in city governments, as well as with labor rights organizations and grassroots activists who might be conducting more adversarial assessments of algorithmic systems.

With this context in mind, here are some of the current problems and questions framing our research:

  • How should we measure impacts? There is a need for regulation and standards when it comes to measuring sustainability and ethics claims. What methods and tools do researchers need to assess human rights impacts in tandem with the environmental impacts of algorithmic systems?

Environmental, social, and governance (ESG) reporting is a growing trend: the EU and many US states now require companies to release annual reports with statistics related to the UN’s Sustainable Development Goals, and as a result the vast majority of major companies released reports on their sustainability, human rights, and diversity, equity, and inclusion contributions in 2022. ESG compliance has become an entire specialization within corporations, where corporate responsibility strategists balance environmental and human rights impact assessments and reporting. Emergent standards and certifications exist for assessing sustainability claims in the tech industry, as in other industries. But many critics have questioned the methods used to calculate net-zero progress, and carbon offsets have proven to be unreliable at best.

Social impacts, on the other hand, often resist even the appearance of concreteness. How do we reconcile the desire to come up with clean metrics and measurements — to present companies and agencies with a simple checklist — with the complexity of impacts, especially given the wide range of potential use cases for some algorithmic systems? Illustrating this tension, Jutta Haider and Malte Rödl consider the “algorithmically embodied emissions” of recommender systems that inadvertently reinforce high-carbon behaviors even as they promise to reduce carbon emissions.

Algorithmic systems are designed to reduce or remove friction, and from the perspective of developers, the humans in the loop can themselves be a source of friction. To take one example: there is often a contradiction between the sustainability and ethics goals that corporations espouse and their willingness to consider the broader sociotechnical implications of the technologies they are designing and deploying. An AI developer might have access to carbon-aware tools that allow her to schedule her ML training runs for times of day when more renewable energy is available on the grid (sketched below), but if her manager and the larger organization do not prioritize reducing carbon emissions, she won’t necessarily be able to implement the tools and practices that might reduce the carbon intensity of her work. Developers’ work conditions and workplace hierarchies also shape the effectiveness of software tools designed to make the industry more sustainable or more responsible. Designing the ideal tool is far from enough to hold companies accountable.
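To make the carbon-aware scheduling idea concrete, here is a minimal sketch of what such a tool’s logic might look like. The get_forecast helper, the region name, and the hour-picking heuristic are illustrative assumptions rather than the interface of any real carbon-intensity API; the point is only to show the shape of the decision a developer would be making.

```python
# Minimal sketch of carbon-aware scheduling for ML training jobs.
# Assumes a hypothetical get_forecast() that returns hourly grid carbon-intensity
# forecasts (gCO2eq/kWh) for a data-center region; real data sources and their
# interfaces vary, so treat this as illustrative only.
from datetime import datetime


def get_forecast(region: str) -> dict[datetime, float]:
    """Hypothetical helper: {hour: forecast carbon intensity} for the next 24 hours."""
    raise NotImplementedError("replace with your grid carbon-intensity data source")


def pick_training_window(region: str, hours_needed: int) -> datetime:
    """Return the start hour whose window has the lowest average forecast intensity."""
    forecast = sorted(get_forecast(region).items())  # [(hour, intensity), ...]
    best_start, best_avg = None, float("inf")
    for i in range(len(forecast) - hours_needed + 1):
        window = [intensity for _, intensity in forecast[i : i + hours_needed]]
        avg = sum(window) / hours_needed
        if avg < best_avg:
            best_start, best_avg = forecast[i][0], avg
    return best_start


# Usage sketch: ask the cluster scheduler to delay the job until the chosen hour.
# start = pick_training_window("us-west-2", hours_needed=6)
# submit_training_job(start_time=start)  # hypothetical scheduler call
```

Even in this simple form, the hard part is not the code: it is whether the organization treats delaying a training run for a lower-carbon window as an acceptable trade-off.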

Meanwhile, social justice and human rights advocates are of course focused on the social repercussions of technological systems; there is no system without the humans who participate in it, or who are acted upon by it. There will always be a gap between the expected uses of a technology and what it looks like on the ground. Our job as sociotechnical researchers is to understand that gap and to find ways of mitigating the harms that emerge from it.

  • When and where should impacts be assessed? AI implementation affects different groups at different times in a system’s life cycle, which can make it difficult to efficiently or accurately measure social impacts. But justice must be a central part of the impact assessment process. How can sociotechnical researchers make justice a factor in measuring impact across the life cycle?

Examining the entire life cycle of a system — from development and manufacturing to the breakdown of systems when they are sunsetted — is essential. Rather than bringing communities in late and adding focus groups after harms have already occurred, some companies and organizations are attempting to include communities, and marginalized groups in particular, from the very beginning of technology design and of the impact assessment process, and to consider environmental justice concerns alongside other factors. AI researchers Bogdana Rakova and Roel Dobbe put forth a framework for considering the intersection of social and environmental impacts through the lens of environmental and climate justice. As they argue, “without a broader and more integral lens addressing the couplings of algorithmic systems with socioecological and sociotechnical systems, the algorithmic impact assessment process is bound to miss how technical interventions threaten to exacerbate existing power imbalances and yield harmful consequences.”

Practically speaking, what are the geographic and temporal boundaries of an impact assessment? Physical and social impacts are observable and measurable at the project or community level, but may be harder to track at the regional or global level, and decisions made in one area can affect other parts of the world. This is especially true if we are assessing the full life cycle of technology development, implementation, and expiration. E-waste, for example, is a problem because of how systems are designed, including planned obsolescence, and it has far-reaching consequences in the majority world, where e-waste is both a major pollutant and part of local economies, as in Southeast Asia. What difference does it make for communities if researchers examine a system’s impacts during the development stage, getting in on the ground floor while a model is being built, as opposed to conducting impact assessments after a vendor has already produced and implemented a system? And what would planning for the end of a system’s life look like?

  • Who is included in the impact assessment process? While we are in the midst of a participatory turn when it comes to research methods, there is still a lot of work to be done to ensure that community engagement is done properly. How do we as researchers, activists, and policymakers include community perspectives without tending towards extraction — and while navigating stakeholder fatigue?

Asking marginalized communities to be part of a conversation from the start of the design process, instead of as an afterthought during user testing, is certainly a step in the right direction. But there is also the deeper matter of building trust, and the need to account for wariness on the part of communities who rightly worry about exploitation and co-option. Using proxies for entire communities, or failing to consider sources of disagreement within them, can increase distrust, undermine the credibility of those seeking to assess impact, and thwart progress. Indeed, the term “community” itself can obfuscate differences and power differentials. And the conundrum of extraction is always lurking.

So how might researchers consult with communities without overburdening them or asking them to do unpaid labor, and without asking the same civil rights and advocacy groups to participate again and again? How do we know if we are engaging the “right” communities in the first place, and what assumptions are implicit when researchers themselves decide which communities to include and which to exclude? Taking a richly historical view of participation, Abeba Birhane et al. argue that “greater clarity is needed on what participation is, who it is supposed to serve, how it can be used in the specific context of AI, and how it is related to the mechanisms and approaches already available.” The legal scholar Michele Gilman emphasizes the need to consider accessibility, childcare and other constraints, and remuneration when asking people from vulnerable communities to take on more work in attempts to democratize AI through participatory methods. And while large companies might be equipped to take on participatory work during their design and iteration process, many startups lack UX or corporate responsibility teams. What kinds of standards might emerge in different organizations when it comes to including communities in AIAs?

As AIMLab continues our work in 2024, we will be assembling a community of practice, including fellow travelers who are also considering algorithmic impact assessments at the intersection of human rights and responsible and sustainable AI. We view participatory methods as central to our goals, and we hope to create space for others in our larger community to share strategies and best practices. If your organization is eager to get a head start on figuring out how to assess algorithmic systems and would like to partner with us, please get in touch: [email protected].