Presenting PERVADE: Exploring data ethics for researchers and analysts


By Katie Shilton & Mary Kendig, College of Information Studies, University of Maryland, College Park.

Social media, networked information technologies, and wearable devices have increased the flow of rich, but often personal, information about people. Because this information is created by devices and actions embedded throughout people’s daily routines, we call it pervasive data. The increase in availability of pervasive data has resulted in new forms of research and analysis in fields as diverse as computer science, journalism, and marketing. 

Researchers who analyze data gathered as a byproduct of people’s digital actions and routines face ethical challenges as they weigh concerns about consent, privacy and surveillance, risk, and fairness. Some pervasive data research is met with public outcry, while other uses of pervasive data are lauded. What is behind these differences? There is little existing guidance – or even informal consensus – for researchers and analysts who wish to use pervasive data in ways recognized as ethical or just. Researchers are faced with questions such as:

  • What are the real risks to individuals and groups in the use of pervasive data?
  • How do people experience the reuse of their personal data?
  • What factors impact people’s willingness to contribute pervasive data to research?
  • What steps should be taken to protect the autonomy and safety of individuals?
  • What do researchers in private and academic sectors do differently?
  • How are regulators adjusting to the new burdens they face in governing data research?
  • How are other researchers and analysts—many with different methods, training, standards, and approaches—addressing these questions?

A six-institution team, funded by the U.S. National Science Foundation, has come together to answer these questions. The PERVADE (Pervasive Data Ethics for Computational Research) team will use computational methods and modeling, interviews, content analysis, and surveys to model user concerns and disseminate evidence-based best practices and decision support tools for data ethics. The PERVADE team brings together:

  • Dr. Katie Shilton - College of Information Studies at the University of Maryland, College Park
  • Dr. Jessica Vitak - College of Information Studies at the University of Maryland, College Park
  • Dr. Matthew Bietz - Department of Informatics at the University of California, Irvine
  • Dr. Casey Fiesler - Department of Information Science, College of Media, Communication and Information at the University of Colorado Boulder
  • Dr. Jacob Metcalf - Data & Society Research Institute
  • Dr. Arvind Narayanan - Department of Computer Science at Princeton University
  • Dr. Michael Zimmer - School of Information Studies at the University of Wisconsin-Milwaukee

While many of the project’s impacts will focus on research policy, data-driven journalism can also benefit from its discoveries. Journalists are increasingly users of pervasive data. Gathering data for stories from Twitter or Facebook profiles, or using data from road cameras and traffic violations to report a story, raises issues similar to those raised by research uses of pervasive data. Like big data researchers, journalists must balance investigation and discovery with minimizing harm. When these principles come into tension, an understanding of ethical challenges and norms can help journalists make choices about how and when to use pervasive data.

Journalism also provides a data source for answering the project’s research questions. Social computing research has become an increasingly mainstream topic, and public reaction to big data studies provides evidence of how people feel about such research. One PERVADE project will analyze news articles and comments related to social computing to understand public rhetoric surrounding data science. A better understanding of how pervasive data research is presented and perceived can suggest best practices for both conducting big data research and presenting its findings.

As the project progresses, its findings will be shared as best practices, decision support tools, and public educational materials.

Follow project progress and findings here and here.

Image: Stephen L Harlow.