Video, April 30, 2015

How Anonymous Is Anonymity? Open Data Releases and Re-identification

Daniel Barth-Jones, Felix Wu

Databite No. 35

Is it possible to openly share important raw data sets while preserving the privacy of individuals whose personal information is captured in the data in anonymized form? Numerous recent reports have suggested that it is often possible to use just a few bits of additional information to “re-identify” individuals in purportedly anonymous data sets, even when names and other basic identifiers have been removed. In this conversation, two scholars with distinct perspectives on the “re-identification” threat, Felix Wu (Cardozo School of Law) and Daniel Barth-Jones (Columbia University Mailman School of Public Health), address the question of how to balance the risks of such privacy breaches with the potential insights into civic and public health problems that analyses of large collections of personally identifiable data may provide.

Download the slides from Daniel Barth-Jones’ presentation here.


Daniel Barth-Jones, PhD, MPH, is an infectious disease epidemiologist who specializes in computer simulation of the transmission and public health control of HIV and other infectious disease epidemics. His primary research interests include the epidemiology of HIV and sexually transmitted diseases, theoretical population vaccinology, Phase III HIV vaccine trial design, and health economic evaluations of public health policies for vaccination and preventative intervention programs. His research on HIV vaccine modeling and HIV vaccination strategy/policy development has been sponsored by the U.S. Centers for Disease Control and Prevention (CDC), the International AIDS Vaccine Initiative (IAVI), the Joint United Nations Programme on HIV/AIDS (UNAIDS), and the World Health Organization (WHO). Dr. Barth-Jones has conducted research in collaboration with the Ministries of Health in China, Brazil, Peru, Kenya, and Thailand, and he has been a frequent scientific advisor to WHO, UNAIDS, and IAVI. Dr. Barth-Jones is also a nationally recognized expert in the area of statistical disclosure analysis and control, where his work focuses on the development of statistical and geospatial disclosure control methodologies to help assure the confidentiality and privacy of healthcare data in compliance with the HIPAA Privacy Rule. He has given scientific presentations and conducted educational training on HIPAA Privacy regulations for numerous healthcare information organizations, healthcare delivery organizations, and state and federal agencies, as well as within academia.

Felix Wu’s doctoral studies in computer science are foundational to his information law scholarship, which spans freedom of speech, privacy law, and intellectual property law. He has previously written on the limits of online intermediary immunity and on understanding the role of data de-identification in law. His current work explores the relationship between data privacy and theories of free expression.

Professor Wu was previously an associate at Covington & Burling in San Francisco. In 2006-2007, he clerked for Judge Sandra L. Lynch of the United States Court of Appeals for the First Circuit. Immediately prior to coming to Cardozo, he was an intellectual property associate at Fish & Richardson in Boston.

Professor Wu received his undergraduate degree in computer science, summa cum laude, from Harvard in 1996. He is a member of the Order of the Coif and Phi Beta Kappa.

About Databites
Data & Society’s “Databites” speaker series presents timely conversations about the purpose and power of technology, bridging our interdisciplinary research with broader public conversations about the societal implications of data and automation.