Ford Foundation blog | 05.30.17
D&S affiliate Wilneida Negrón details the role of bots and automation in activism today.
As everyone from advertisers to political adversaries jockeys for attention, they are increasingly using automated technologies and processes to raise their own voices or drown out others. In fact, 62 percent of all Internet traffic is made up of programs acting on their own to analyze information, find vulnerabilities, or spread messages. Up to 48 million of Twitter’s 320 million users are bots, or applications that perform automated tasks. Some bots post beautiful art from museum collections, while some spread abuse and misinformation instead. Automation itself isn’t cutting edge, but the prevalence and sophistication of how automated tools interact with users is.
Harvard Business Review | 05.16.17
D&S researcher Mark Latonero provides an overview of the role of large tech companies in refugee crises.
While the 40-page brief is filled with arguments in support of immigration, it hardly speaks about refugees, except to note that those seeking protection should be welcomed. Any multinational company with a diverse workforce would be concerned about limits to international hiring and employee travel. But tech companies should also be concerned about the refugee populations that depend on their digital services for safety and survival.
Harvard Business Review | 04.19.17
Jongbin Jung, Connor Concannon, D&S fellow Ravi Shroff, Sharad Goel, and Daniel G. Goldstein explore new methods for machine learning in criminal justice.
Simple rules certainly have their advantages, but one might reasonably wonder whether favoring simplicity means sacrificing performance. In many cases the answer, surprisingly, is no. We compared our simple rules to complex machine learning algorithms. In the case of judicial decisions, the risk chart above performed nearly identically to the best statistical risk assessment techniques. Replicating our analysis in 22 varied domains, we found that this phenomenon holds: Simple, transparent decision rules often perform on par with complex, opaque machine learning methods.
D&S fellow Mark Ackerman develops a checklist to address the sociotechnical issues demonstrated in Cathy O’Neil’s Weapons of Math Destruction.
These checklist items for socio-technical design are all important for policy as well. Yet the book makes it clear that not all “sins” can be reduced to checklist form. The book also explicates other issues that cannot easily be foreseen and are almost impossible for implementers to see in advance, even if well-intentioned. One example from the book is college rankings, where the attempt to be data-driven slowly created an ecology where universities and colleges paid more attention to the specific criteria used in the algorithm. In other situations, systems will be profit-generating in themselves, and therefore implemented, but suboptimal or societally harmful — this is especially true, as the book nicely points out, for systems that operate over time, as happened with mortgage pools. Efficiency may not be the only societal goal — there is also fairness, accountability, and justice. One of the strengths of the book is to point this out and make it quite clear.
Anil Dash analyzes how ‘teaching kids to code’ does not address the wider inequities that prevent diversity in tech.
Many tech companies are still terrible at inclusion in their hiring, a weakness which is even more unacceptable given the diversity of the younger generations we’re educating today. Many of the biggest, most prominent companies in Silicon Valley—including giants like Apple and Google—have illegally colluded against their employees to depress wages, so even employees who do get past the exclusionary hiring processes won’t necessarily end up in an environment where they’ll be paid fairly or have equal opportunity to advance. If the effort to educate many more programmers succeeds, simple math tells us that a massive increase in the number of people qualified to work on technology would only drive down today’s high wages and outrageously generous benefits. (Say goodbye to the free massages!)
IEEE Annals of the History of Computing | 09.01.16
D&S post-doctoral scholar Caroline Jack analyzes the history of how businesses donated personal computers to classrooms as a way to engage students.
In late 1982, the corporate-funded business education nonprofit Junior Achievement (JA) distributed 121 donated personal computers to classrooms across the United States as part of its new high school course, Applied Economics. Studying JA’s use of computers in Applied Economics reveals how a corporate-sponsored nonprofit group used personal computers to engage students, adapt its traditional outreach methods to the classroom, and bolster an appreciation of private enterprise in American economic life. Mapping the history of how business advocacy and education groups came to adopt software as a means of representing work and commerce offers a new perspective on how systems of cultural meaning have been attached to, and expressed through, computers and computing.
Code is key to civic life, but we need to start looking under the hood and thinking about the externalities of our coding practices, especially as we’re building code as fast as possible with few checks and balances.
Points: “Be Careful What You Code For” is danah boyd’s talk from Personal Democracy Forum 2016 (June 9, 2016); her remarks have been modified for Points. danah exhorts us to mind the externalities of code and proposes audits as a way to reckon with the effects of code in high stakes areas like policing. Video is available here.
points | 06.13.16
In this Points piece “Real Life Harms of Student Data,” D&S researcher Mikaela Pitcan argues that assessing real harms connected with student data forces us to acknowledge the mundane, human causes. And she asks: “What do we do now?”
“Overall, cases where student data has led to harms aren’t about data per se, but about the way that people interact with the data…
…accidental data leaks, data being hacked, data being lost, school officials using off-campus information for discipline, oversight in planning for data handling when companies are sold, and faulty data and data systems resulting in negative outcomes. Let’s break it down.”
ProPublica | 05.23.16
D&S fellow Surya Mattu investigated bias in risk assessments, algorithmically generated scores predicting the likelihood of a person committing a future crime. These scores are increasingly used in courtrooms across America to inform decisions about who can be set free at every stage of the criminal justice system, from assigning bond amounts to fundamental decisions about a defendant’s freedom:
We obtained the risk scores assigned to more than 7,000 people arrested in Broward County, Florida, in 2013 and 2014 and checked to see how many were charged with new crimes over the next two years, the same benchmark used by the creators of the algorithm.
The score proved remarkably unreliable in forecasting violent crime: Only 20 percent of the people predicted to commit violent crimes actually went on to do so.
When a full range of crimes were taken into account — including misdemeanors such as driving with an expired license — the algorithm was somewhat more accurate than a coin flip. Of those deemed likely to re-offend, 61 percent were arrested for any subsequent crimes within two years.
We also turned up significant racial disparities, just as Holder feared. In forecasting who would re-offend, the algorithm made mistakes with black and white defendants at roughly the same rate but in very different ways.
- The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants.
- White defendants were mislabeled as low risk more often than black defendants.
Could this disparity be explained by defendants’ prior crimes or the type of crimes they were arrested for? No. We ran a statistical test that isolated the effect of race from criminal history and recidivism, as well as from defendants’ age and gender. Black defendants were still 77 percent more likely to be pegged as at higher risk of committing a future violent crime and 45 percent more likely to be predicted to commit a future crime of any kind.
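The distinction drawn in this excerpt, similar overall error rates but different kinds of mistakes, can be made concrete with a short sketch. The Python below computes false positive and false negative rates per group. The function names and the toy records are hypothetical: they are not ProPublica's data or code, and the counts were chosen only to illustrate the pattern.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is (group, predicted_high_risk, reoffended), where the last
    two fields are booleans.
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, predicted, actual in records:
        if predicted and not actual:
            counts[group]["fp"] += 1  # flagged, but did not re-offend
        elif not predicted and not actual:
            counts[group]["tn"] += 1
        elif not predicted and actual:
            counts[group]["fn"] += 1  # labeled low risk, but re-offended
        else:
            counts[group]["tp"] += 1

    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]  # people who did not re-offend
        positives = c["fn"] + c["tp"]  # people who did re-offend
        fpr = c["fp"] / negatives if negatives else float("nan")
        fnr = c["fn"] / positives if positives else float("nan")
        rates[group] = (fpr, fnr)
    return rates


def make_group(name, fp, tn, fn, tp):
    """Build toy records for one group from confusion-matrix counts."""
    return (
        [(name, True, False)] * fp
        + [(name, False, False)] * tn
        + [(name, False, True)] * fn
        + [(name, True, True)] * tp
    )


# Hypothetical data: both groups have 3 errors out of 10 people,
# but the errors fall in opposite directions.
toy_records = make_group("A", fp=2, tn=3, fn=1, tp=4) + make_group(
    "B", fp=1, tn=4, fn=2, tp=3
)
rates = error_rates_by_group(toy_records)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"group {group}: false positive rate {fpr:.2f}, false negative rate {fnr:.2f}")
```

In this toy data the overall error rate is identical for both groups, yet group A is wrongly flagged twice as often as group B, while group B is wrongly cleared twice as often as group A. An aggregate accuracy figure alone would hide that difference.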
Medium | 05.18.16
D&S Researcher Alex Rosenblat on the fallout of Austin’s transportation showdown with Uber and Lyft:
Uber allied with Lyft in Austin to lobby against an ordinance passed by the city council which requires ridehail drivers to undergo fingerprint-based background checks. The two companies spent $8.1 million combined to encourage (i.e. bombard with robo-texts) Austin voters to oppose the ordinance in a referendum vote called Proposition 1. If local cities take a stand against Uber or Lyft’s demand about background checks, and they prevail, that could produce a ripple effect in other cities that have regulatory demands. The local impact on Austin is a secondary concern to the global and national ambitions of imperial Uber and parochial Lyft. When they lost the vote on Prop. 1, they followed through on their threats to withdraw their services.
Medium | 04.29.16
D&S Researcher Alex Rosenblat examines and problematizes Uber’s stance against tipping and the resulting effects on Uber drivers.
D&S Advisor Ethan Zuckerman pushes back against a new myth developing around Bitcoin as a ready-made solution to complex humanitarian and international development problems around the globe:
Is Bitcoin really the best way to think about establishing a digital commons for financial transactions? Maybe not. The Bitcoin network requires large amounts of bandwidth to run and uses enormous amounts of power, which makes it challenging for people in the developing world to participate in mining or use the network reliably.
The challenges of financial inclusion in a place like Kenya are diverse, from the cost of sending and receiving money, to the difficulty in doing business across borders, and the concentrated power of Safaricom. Perhaps Bitcoin could spur financial innovation there. But there are no guarantees. Understanding what Bitcoin can do for people in the developing world will first require a better understanding of the people who live there.
Balkin.blogspot.com | 03.31.16
Reflections from D&S Affiliate Solon Barocas and Advisors Edward W. Felten and Joel Reidenberg on the recent “Unlocking the Black Box” Conference held on April 2 at Yale Law School:
Our work on accountable algorithms shows that transparency alone is not enough: we must have transparency of the right information about how a system works. Transparency and the evaluation of computer systems as inscrutable black boxes, against which we can only test the relationship of inputs and outputs, both fail on their own to effect even the most basic procedural safeguards for automated decision making. And without a notion of procedural regularity on which to base analysis, it is fruitless to inquire as to a computer system’s fairness or compliance with norms of law, politics, or social acceptability. Fortunately, the tools of computer science provide the necessary means to build computer systems that are fully accountable. Both transparency and black-box testing play a part, but if we are to have accountable algorithms, we must design for this goal from the ground up.
D&S Researcher Alex Rosenblat unpacks the implications of Uber’s power to unilaterally set and change the rates passengers pay, the rates that drivers are paid, and the commission Uber takes. She also asks whether the conditions of driving for Uber are necessarily a form of “collusion”:
How drivers earn money is directly impacted by the policies and behavioral expectations Uber devises for how they interact with the Uber platform, and with Uber passengers. Drivers have to meet Uber’s performance targets in their local markets, such as a 90% ride acceptance rate, a low cancellation rate, like 5%, and maintain a high average rating that hovers at a minimum of 4.6/5 stars, often by performing according to Uber’s “recommended” etiquette. If they fall below the local performance targets, they risk deactivation (an Uber word for “temporarily suspended” or “fired”).
Aside from these implicit controls over how drivers interact with the system, Uber has a policy of blind passenger acceptance through its automated dispatcher. The system is designed to encourage drivers to accept all rides by hiding the destination of the passenger, generating goodwill for the company and support from its passenger base. In effect, not only does Uber set the price — Uber also requires drivers to accept those fares when drivers might otherwise reject them for being unprofitable, such as short, minimum fare rides. Drivers also receive deactivation warnings for displaying a preference for surge fares over non-surge fares. As such, their eligibility to work on the Uber platform could plausibly be construed as contingent on the very conditions that would violate anti-trust laws: if they are not in compliance with Uber’s system for setting prices, they risk deactivation. Those restrictions on drivers’ independence really call into question their ability to act freely as entrepreneurs.
Medium | 03.27.16
D&S Advisor Andrew McLaughlin reflects on Facebook’s approach to implementing their Free Basics program:
In opening a door to the Internet, Facebook doesn’t need to be a gatekeeper. The good news, though, is that Facebook could quite easily fix its two core flaws and move forward with a program that is effective, widely supported, and consistent with Internet ideals and good public policy.
In this gatekeeper-less model, neither the user nor the online service has to ask Facebook’s permission to connect with each other. And that’s what makes all the difference. Rather than referring to an approved set of ~300 companies, the word “Basics” in Free Basics would denote any site or service anywhere in the world that provides a standards-compliant, low-bandwidth, mobile-optimized version.
Student data can and has served as an equalizer, but it also has the potential to perpetuate discriminatory practices. In order to leverage student data to move toward equity in education, researchers, parents, and educators must be aware of the ways in which data serves to equalize as well as disenfranchise. Common discourse surrounding data as an equalizer can fall along a spectrum of “yes, it’s the fix” or “this will never work.” Reality is more complicated than that.
Points: “Does data-driven learning improve equity?” That depends, says Mikaela Pitcan in this Points original. Starting assumptions, actual data use practices, interpretation, context context context — all complicate the story around education data and must be kept in mind if equity is our objective.
paper | 03.10.16
D&S Fellow Sorelle Friedler and D&S Affiliate Ifeoma Ajunwa argue in this essay that well-settled legal doctrines that prohibit discrimination against job applicants on the basis of sex or race dictate an examination of how algorithms are employed in the hiring process, with the specific goals of: 1) predicting whether such algorithmic decision-making could generate decisions having a disparate impact on protected classes; and 2) repairing input data in such a way as to prevent disparate impact from algorithmic decision-making.
Major advances in machine learning have encouraged corporations to rely on Big Data and algorithmic decision making with the presumption that such decisions are efficient and impartial. In this Essay, we show that protected information that is encoded in seemingly facially neutral data could be predicted with high accuracy by algorithms and employed in the decision-making process, thus resulting in a disparate impact on protected classes. We then demonstrate how it is possible to repair the data so that any algorithm trained on that data would make non-discriminatory decisions. Since this data modification is done before decisions are applied to any individuals, this process can be applied without requiring the reversal of decisions. We make the legal argument that such data modifications should be mandated as an anti-discriminatory measure. And akin to Professor Ayres’ and Professor Gerarda’s Fair Employment Mark, such data repair that is preventative of disparate impact would be certifiable by teams of lawyers working in tandem with software engineers and data scientists. Finally, we anticipate the business necessity defense that such data modifications could degrade the accuracy of algorithmic decision-making. While we find evidence for this trade-off, we also found that on one data set it was possible to modify the data so that despite previous decisions having had a disparate impact under the four-fifths standard, any subsequent decision-making algorithm was necessarily non-discriminatory while retaining essentially the same accuracy. Such an algorithmic “repair” could be used to refute a business necessity defense by showing that algorithms trained on modified data can still make decisions consistent with their previous outcomes.
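The four-fifths standard mentioned in this abstract compares selection rates across groups: a group whose selection rate falls below 80 percent of the highest group's rate is generally treated as showing evidence of adverse impact. A minimal sketch of that check follows. The function names, group labels, and applicant counts are hypothetical, and the code only tests for disparate impact; it does not implement the data repair the essay describes.

```python
def selection_rates(outcomes):
    """Map each group to its selection rate.

    `outcomes` is {group: (number_selected, number_of_applicants)}.
    """
    rates = {}
    for group, (selected, applicants) in outcomes.items():
        if applicants <= 0:
            raise ValueError(f"group {group!r} has no applicants")
        rates[group] = selected / applicants
    return rates


def four_fifths_check(outcomes, threshold=0.8):
    """Return {group: (impact_ratio, passes)} relative to the highest rate."""
    rates = selection_rates(outcomes)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selections; ratio is undefined")
    return {
        group: (rate / highest, rate / highest >= threshold)
        for group, rate in rates.items()
    }


# Hypothetical hiring outcomes, for illustration only.
toy_outcomes = {"group_x": (50, 100), "group_y": (30, 100)}
result = four_fifths_check(toy_outcomes)
for group, (ratio, passes) in sorted(result.items()):
    status = "meets" if passes else "falls below"
    print(f"{group}: impact ratio {ratio:.2f} {status} the four-fifths threshold")
```

Here group_y is selected at 30 percent against group_x's 50 percent, an impact ratio of 0.6, so the toy outcomes would fall short of the four-fifths threshold.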
University of Pennsylvania Law Review | 03.02.16
D&S Affiliate Solon Barocas and Advisors Edward W. Felten and Joel Reidenberg collaborate on a paper outlining the importance of algorithmic accountability and fairness, proposing several tools that can be used when designing decision-making processes.
Abstract: Many important decisions historically made by people are now made by computers. Algorithms count votes, approve loan and credit card applications, target citizens or neighborhoods for police scrutiny, select taxpayers for an IRS audit, and grant or deny immigration visas.
The accountability mechanisms and legal standards that govern such decision processes have not kept pace with technology. The tools currently available to policymakers, legislators, and courts were developed to oversee human decision-makers and often fail when applied to computers instead: for example, how do you judge the intent of a piece of software? Additional approaches are needed to make automated decision systems — with their potentially incorrect, unjustified or unfair results — accountable and governable. This Article reveals a new technological toolkit to verify that automated decisions comply with key standards of legal fairness.
We challenge the dominant position in the legal literature that transparency will solve these problems. Disclosure of source code is often neither necessary (because of alternative techniques from computer science) nor sufficient (because of the complexity of code) to demonstrate the fairness of a process. Furthermore, transparency may be undesirable, such as when it permits tax cheats or terrorists to game the systems determining audits or security screening.
The central issue is how to assure the interests of citizens, and society as a whole, in making these processes more accountable. This Article argues that technology is creating new opportunities — more subtle and flexible than total transparency — to design decision-making algorithms so that they better align with legal and policy objectives. Doing so will improve not only the current governance of algorithms, but also — in certain cases — the governance of decision-making in general. The implicit (or explicit) biases of human decision-makers can be difficult to find and root out, but we can peer into the “brain” of an algorithm: computational processes and purpose specifications can be declared prior to use and verified afterwards.
The technological tools introduced in this Article apply widely. They can be used in designing decision-making processes from both the private and public sectors, and they can be tailored to verify different characteristics as desired by decision-makers, regulators, or the public. By forcing a more careful consideration of the effects of decision rules, they also engender policy discussions and closer looks at legal standards. As such, these tools have far-reaching implications throughout law and society.
Part I of this Article provides an accessible and concise introduction to foundational computer science concepts that can be used to verify and demonstrate compliance with key standards of legal fairness for automated decisions without revealing key attributes of the decision or the process by which the decision was reached. Part II then describes how these techniques can assure that decisions are made with the key governance attribute of procedural regularity, meaning that decisions are made under an announced set of rules consistently applied in each case. We demonstrate how this approach could be used to redesign and resolve issues with the State Department’s diversity visa lottery. In Part III, we go further and explore how other computational techniques can assure that automated decisions preserve fidelity to substantive legal and policy choices. We show how these tools may be used to assure that certain kinds of unjust discrimination are avoided and that automated decision processes behave in ways that comport with the social or legal standards that govern the decision. We also show how algorithmic decision-making may even complicate existing doctrines of disparate treatment and disparate impact, and we discuss some recent computer science work on detecting and removing discrimination in algorithms, especially in the context of big data and machine learning. And lastly in Part IV, we propose an agenda to further synergistic collaboration between computer science, law and policy to advance the design of automated decision processes for accountability.
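The abstract speaks of rules that are declared prior to use and verified afterwards, without the rule itself having to be revealed up front. One standard computer science building block with that shape is a cryptographic commitment. The sketch below uses a salted SHA-256 hash as a simple commitment. It is an illustrative assumption, not the Article's actual protocol, and the function names and example rule text are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(rule_text: str) -> Tuple[str, bytes]:
    """Commit to a decision rule without revealing it.

    Returns (commitment, nonce). The commitment can be published before any
    decisions are made; the nonce and the rule stay private until an audit.
    """
    nonce = secrets.token_bytes(32)  # random salt so the rule cannot be guessed
    digest = hashlib.sha256(nonce + rule_text.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, rule_text: str) -> bool:
    """Check that a revealed rule matches the earlier commitment."""
    expected = hashlib.sha256(nonce + rule_text.encode("utf-8")).hexdigest()
    return hmac.compare_digest(commitment, expected)


# Hypothetical rule, for illustration only.
announced_rule = "select applicants uniformly at random from the eligible pool"
commitment, nonce = commit(announced_rule)

# Later, an overseer is shown the rule and nonce and checks them.
print(verify(commitment, nonce, announced_rule))
print(verify(commitment, nonce, "select applicants by a different rule"))
```

A commitment of this kind only shows that the rule was fixed in advance. Showing that each individual decision actually followed the committed rule needs further machinery, which this sketch does not attempt.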
D&S Board Member Anil Dash contrasts two recent approaches to making internet connectivity more widely available. Comparing the efforts to build consensus behind Facebook’s Free Basics initiative to LinkNYC, the recently-launched program to bring free broadband wifi to New York City, Dash views each situation as a compelling example of who gets heard, and when, any time a big institution tries to create a technology infrastructure to serve millions of people.
There’s one key lesson we can take from these two attempts to connect millions of people to the Internet: it’s about building trust. Technology infrastructure can be good or bad, extractive or supportive, a lifeline or a raw deal. Objections to new infrastructure are often dismissed by the people pushing them, but people’s concerns are seldom simply about advertising or being skeptical of corporations. There are often very good reasons to look a gift horse in the mouth.
Whether we believe in the positive potential of getting connected simply boils down to whether we feel the people providing that infrastructure have truly listened to us. The good news is, we have clear examples of how to do exactly that.
D&S fellow Mimi Onuoha thinks through the implications of the moment of data collection and offers a compact set of reminders for those who work with and think about data.
The conceptual, practical, and ethical issues surrounding “big data” and data in general begin at the very moment of data collection. Particularly when the data concern people, not enough attention is paid to the realities entangled within that significant moment and spreading out from it.
The point of data collection is a unique site for unpacking change, abuse, unfairness, bias, and potential. We can’t talk about responsible data without talking about the moment when data becomes data.
D&S Fellow Mark Latonero considers the digital infrastructure for movement of refugees — the social media platforms, mobile apps, online maps, instant messaging, translation websites, wire money transfers, cell phone charging stations, and Wi-Fi hotspots — that is accelerating the massive flow of people from places like Syria, Iraq, and Afghanistan to Greece, Germany, and Norway. He argues that while the tools that underpin this passage provide many benefits, they are also used to exploit refugees and raise serious questions about surveillance.
Refugees are among the world’s most vulnerable people. Studies have shown that undue surveillance towards marginalized populations can drive them off the grid. Both perceived and real fears around data collection may result in refugees seeking unauthorized routes to European destinations. This avoidance strategy can make them invisible to officials and more susceptible to criminal enterprises. Data collection on refugees should balance security and public safety with the need to preserve human dignity and rights. Governments and refugee agencies need to establish trust when collecting data from refugees. Technology companies should acknowledge their platforms are used by refugees and smugglers alike and create better user safety measures. As governments and leaders coordinate a response to the crisis, appropriate safeguards around data and technology need to be put in place to ensure the digital passage is safe and secure.
writeup | 06.19.15
On May 19, 2015, a group of about 20 individuals gathered at New America in Washington, DC for a discussion co-hosted by The Leadership Conference on Civil and Human Rights, Data & Society Research Institute, Upturn, and New America’s Open Technology Institute. The group was composed of technologists, researchers, civil rights advocates, and law enforcement representatives, with the goal of broadening the discussion surrounding police-worn body cameras within their respective fields and understanding the various communities’ interests and concerns. The series of discussions focused on what the technology behind police cameras consists of, how the cameras can be implemented to protect civil rights and public safety, and what the consequences of implementation might be.
D&S founder danah boyd considers recent efforts at reforming laws around student privacy and what it would mean to actually consider the privacy rights of the most marginalized students.
The threats that poor youth face? That youth of color face? And the trade-offs they make in a hypersurveilled world? What would it take to get people to care about how we keep building out infrastructure and backdoors to track low-status youth in new ways? It saddens me that the conversation is constructed as being about student privacy, but it’s really about who has the right to monitor which youth. And, as always, we allow certain actors to continue asserting power over youth.
The Atlantic | 05.15.15
Excerpt: “Police-worn body cameras are coming. Support for them comes from stakeholders who often take opposing views. Law enforcement wants them, many politicians are pushing for them, and communities that already have a strong police presence in their neighborhoods are demanding that the police get cameras now. Civil-rights groups are advocating for them. The White House is funding them. The public is in favor of them. The collective — albeit, not universal — sentiment is that body cameras are a necessary and important solution to the rising concerns about fatal encounters between police and black men.
“As researchers who have spent the last few months analyzing what is known about body cams, we understand the reasons for this consensus, but we’re nervous that there will be unexpected and undesirable outcomes. On one hand, we’re worried that these expensive technologies will do little to curb systemic abuse. But what really scares us is the possibility that they may magnify injustice rather than help eradicate it. We support safeguards being put in place. But the cameras are not a proven technology, and we’re worried that too much is hinging on them being a silver bullet to a very serious problem. Our concerns stem from three major issues:
“But as history tells us, camera evidence does not an indictment make.”
D&S advisor Janet Vertesi discusses the difficulty with visual evidence in criminal indictments and the power of visual suggestibility, offering evidence as to why police-worn body cameras may not be the panacea they have recently been portrayed as.
Public Understanding of Science | 04.01.15
Abstract: When economists ask questions about basic financial principles, most ordinary people answer incorrectly. Economic experts call this condition “financial illiteracy,” which suggests that poor financial outcomes are due to a personal deficit of reading-related skills. The analogy to reading is compelling because it suggests that we can teach our way out of population-wide financial failure. In this comment, we explain why the idea of literacy appeals to policy makers in the advanced industrial nations. But we also show that the narrow skill set laid out by economists does not satisfy the politically inclusive definition of literacy that literacy studies fought for. We identify several channels through which people engage with ideas about finance and demonstrate that not all forms of literacy will lead people to the educational content prescribed by academic economists. We argue that truly financially literate people can defy the demands of financial theory and financial institutions.
“Seeta Gangadharan is a Senior Research Fellow at the Open Technology Institute in Washington DC [and a D&S fellow]. She discusses the automated systems, known as algorithms, that are replacing human discretion more and more often. Algorithms are a simple set of mathematical rules embedded in the software to complete a task. They allow Google to rank pages according to their relevance and popularity when people conduct an internet search, and allow internet sites like Amazon and Netflix to monitor our purchases and suggest related items. But open technology advocates say there is not enough oversight of these algorithms, which can perpetuate poverty and inequality.”
Kathryn Ryan, The computer algorithms that run our lives, Nine To Noon (Radio New Zealand), 23 February 2015
interview | 02.18.15
“[D&S advisor] Dr. Alondra Nelson studies gender and black studies at the intersection of science, technology, and medicine. She is the author of numerous articles, including, ‘Bio Science: Genetic Genealogy Testing and the Pursuit of African Ancestry,’ as well as Body and Soul: The Black Panther Party and the Fight Against Medical Discrimination and the forthcoming The Social Life of DNA. We talked to her in the Trustees’ Room at Columbia University where she is professor of sociology and gender studies and the Dean of Social Science.”
Jamie Courville, Interview with Alondra Nelson: Race + Gender + Technology + Medicine, JSTOR Daily, February 18, 2015
primer | 10.30.14
Discrimination and racial disparities persist at every stage of the U.S. criminal justice system, from policing to trials to sentencing. The United States incarcerates a higher percentage of its population than any of its peer countries, with 2.2 million people behind bars. The criminal justice system disproportionately harms communities of color: while they make up 30 percent of the U.S. population, they represent 60 percent of the incarcerated population. There has been some discussion of how “big data” can be used to remedy inequalities in the criminal justice system; civil rights advocates recognize potential benefits but remain fundamentally concerned that data-oriented approaches are being designed and applied in ways that also disproportionately harm those who are already marginalized by criminal justice processes.
This document is a workshop primer from Data & Civil Rights: Why “Big Data” is a Civil Rights Issue.
New York Times | 08.07.14
In this op-ed, Data & Society fellow Seeta Peña Gangadharan argues that the “rise of commercial data profiling is exacerbating existing inequities in society and could turn de facto discrimination into a high-tech enterprise.” She urges us to “respond to this digital discrimination by making civil rights a core driver of data-powered innovations and getting companies to share best practices in detecting and avoiding discriminatory outcomes.”
The availability of data is not evenly distributed. Some organizations, agencies, and sectors are better equipped to gather, use, and analyze data than others. If data is transformative, what are the consequences of defense and security agencies having greater capacity to leverage data than, say, education or social services? Financial wherewithal, technical capacity, and political determinants all affect where data is employed. As data and analytics emerge, who benefits and who doesn’t, both at the individual level and the institutional level? What about the asymmetries between those who provide the data and those who collect it? How does uneven data access affect broader issues of inequality? In what ways does data magnify or combat asymmetries in power?
This document is a workshop primer from The Social, Cultural & Ethical Dimensions of “Big Data”.