The Social, Cultural & Ethical Dimensions of “Big Data”

Conference ⋅ March 17, 2014

Data & Society
White House Office of Science and Technology Policy
NYU’s Information Law Institute

On March 17, 2014, the Data & Society Research Institute, the White House Office of Science and Technology Policy (OSTP), and New York University’s Information Law Institute co-hosted “The Social, Cultural, & Ethical Dimensions of ‘Big Data.’”

The purpose of the event was to convene key stakeholders and thought leaders from across academia, government, industry, and civil society to examine the social, cultural, and ethical implications of “big data,” with an eye to both the challenges and opportunities presented by the phenomenon.

The event was one of three conferences that OSTP co-hosted with academic institutions across the country to examine key questions on the use of “big data” and the future of privacy. The other two were a conference organized by the Massachusetts Institute of Technology (MIT) Big Data Initiative at CSAIL and the MIT Information Policy Project, which focused on the technical aspects underpinning privacy (Advancing the State of the Art in Technology and Practice), and an event organized by the School of Information with the Berkeley Center for Law and Technology at the University of California-Berkeley, which explored the legal and policy issues raised by big data (Values and Governance). All three were part of the Obama Administration’s efforts to review the implications of collecting, analyzing, and using massive or complex data sets for privacy, the economy, and public policy.

You can read a summary of the conference discussions here.

The discussions at our conference helped inform the White House’s report issued at the culmination of its 90-day review: Big Data: Seizing Opportunities, Preserving Values.

Session 1

Session 2

Workshops

Algorithmic Accountability

Accountability is fundamentally about checks and balances to power. In theory, both government and corporations are kept accountable through social, economic, and political mechanisms. Journalism and public advocates serve as an additional tool to hold powerful institutions and individuals accountable. But in a world of data and algorithms, accountability is often murky. Beyond questions about whether the market is sufficient or governmental regulation is necessary, how should algorithms be held accountable? For example, what is the role of the fourth estate in holding data-oriented practices accountable?

Primer
Notes

Data Supply Chains

As data moves between actors and organizations, what emerges is a data supply chain. Unlike goods in manufacturing supply chains, transferred data is often duplicated in the process, challenging the essence of ownership. What does ethical data labor look like? How are the various stakeholders held accountable for being good data guardians? What does clean data transfer look like? What kinds of best practices can business and government put into place? What upstream rights do data providers have over downstream commercialization of their data?

Primer
Notes

Inequalities and Asymmetries

The availability of data is not evenly distributed. Some organizations, agencies, and sectors are better equipped to gather, use, and analyze data than others. If data is transformative, what are the consequences of defense and security agencies having greater capacity to leverage data than, say, education or social services? Financial wherewithal, technical capacity, and political determinants all affect where data is employed. As data and analytics emerge, who benefits and who doesn’t, both at the individual level and the institutional level? What about the asymmetries between those who provide the data and those who collect it? How does uneven data access affect broader issues of inequality? In what ways does data magnify or combat asymmetries in power?

Primer
Notes

Inferences and Connections

Data-oriented systems are inferring relationships between people based on genetic material, behavioral patterns (e.g., shared geography imputed by phone carriers), and performed associations (e.g., “friends” online or shared photographs). What responsibilities do entities that collect data imputing connections have to those who are implicated by association? For example, as DNA and other biological materials are collected outside of medicine (e.g., at point of arrest, by informatics services like 23andMe, for scientific inquiry), what rights do relatives (living, dead, and not-yet-born) have? In which contexts is it acceptable to act based on inferred associations, and in which is it not?

Primer
Notes

Interpretation Gone Wrong

Just because data can be made more accessible to broader audiences does not mean that those people are equipped to interpret what they see. Limited topical knowledge, statistical skills, and contextual awareness can prompt people to read inferences into, be afraid of, and otherwise misinterpret the data they are given. As more data is made more available, what other structures and procedures need to be in place to help people interpret what’s available?

Primer
Notes

Predicting Human Behavior

Countless highly accurate predictions can be made from trace data, with varying degrees of personal or societal consequence (e.g., search engines can predict hospital admission, gaming companies can predict compulsive gambling problems, government agencies can predict criminal activity). Predicting human behavior can be both hugely beneficial and deeply problematic depending on the context. What kinds of predictive privacy harms are emerging? And what are the implications for systems of oversight and due process protections? For example, what are the implications for employment, health care, and policing when predictive models are involved? How should varied organizations address what they can predict?

Primer
Notes

Public Plenary

Additional Materials

Invited participants included diverse thought leaders from academia, civil society, government, and industry.
View full schedule.
View production team credits.
Daytime participants suggested additional readings, which can be downloaded as a ZIP file.
All organizing materials can also be downloaded as a ZIP file.

This conference was organized by danah boyd (Data & Society Research Institute / Microsoft Research) with help from Helen Nissenbaum (New York University), Geoffrey C. Bowker (University of California-Irvine), and Kate Crawford (Microsoft Research / MIT Center for Civic Media).