2/11/2017

Managing the ethical risk of prediction in human services: “First, do no harm”

 

Public agencies are charged with making some of the most consequential decisions affecting our lives and communities. They determine who qualifies for assistance in the case of a crisis, when to investigate reports of abuse and neglect, and whether to jail or parole. And, increasingly, the discretion of the judges and social workers who make these decisions is informed not only by their experience but by computers – computers drawing on more data and more complex models than are available to any one person, and that purport to do a better job assessing risk than any unaided public servant.

Critics of applying these data science techniques to public services rightly worry that shoddy implementation of these predictive tools could worsen biases that already exist within public systems and within the data they collect. (See, for example, ProPublica’s excellent series on “machine bias.”) On the other hand, these same tools can produce markedly fairer outcomes for vulnerable populations that regularly interact with social services and the justice system. For example, by replacing its cash bail system with a pre-trial risk assessment, the State of New Jersey reduced its jail population by 19% in six months without any corresponding increase in crime, sharply reducing the harm to arrestees who were formerly held for months in jail for the “crime” of not being able to afford bail.

As the authors of a new report from the US-based MetroLab Network note, the more urgent debate may not be whether predictive analysis of social services decisions is desirable but under what conditions it is useful and safe. Predictive models are already widely used in the justice and education sectors, and they are moving steadily into human services despite the misgivings many share about the potential for their naïve use to cause mischief. The major reason for this expansion is likely the simplest: this kind of predictive modeling works. Governments with constrained resources are eager customers for tools that help them make better decisions, faster, and get services where they are most needed – and predictive analysis is exactly that sort of tool.

What requirements and constraints, then, are necessary to ensure that human service agencies building or purchasing these tools do the most good and the least harm? What should we all know about risk modeling, and what outcomes should we watch for? And what is the oversight role of public leaders, public advocates, and journalists?

Image: Four Principles for Ethically Applying Predictive Tools within Human Services Checklist, which can be found in MetroLab’s full-length report. Courtesy of MetroLab Network.

MetroLab’s report outlines four broad principles to guide the implementation (or investigation) of predictive risk models in human services, along with a checklist of strategies for interrogating these tools for error, bias, or misuse. Underlying many of the report’s recommendations is a caution against turning the development and vetting of these tools over entirely to “data people” or commercial vendors, and a warning to be skeptical of easy assurances that a model is neutral, race-blind, and fair:

  • Agencies cannot design and implement a predictive risk model without actively soliciting the contributions of both agency and community stakeholders. Agency leaders and their data teams do not, by themselves, have sufficient context to root out potential bias and misuse of these tools; they rely on the vigilance of community partners to act as an additional “fail safe”.
  • There is no such thing as a “validated tool” or model – only validated applications of a tool. Every agency’s clients, departmental policies, and record-keeping practices differ in ways that fundamentally affect the reliability and fairness of a model’s risk scores. It follows that agencies must not let the promises of commercial vendors or validation studies from other jurisdictions substitute for their own local due diligence. (This is all the more important for more complex models, which tend to be more sensitive to underlying differences.)
  • Protected variables like race, gender, and disability should never be removed from the dataset – and it is occasionally acceptable to use them as predictive factors. Client characteristics like race, for example, are very likely to be “rediscovered” by predictive models because of their close association with other variables like home address (given patterns of residential segregation in the United States). Retaining them in the data allows data scientists to test explicitly how a model assigns risk to different populations (a group-wise audit along the lines of the sketch after this list). In some carefully vetted cases, including a variable like race as a predictive factor can increase the likelihood that at-risk clients receive preventative care.
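
To make the last two points concrete, here is a minimal sketch of what a local, group-wise audit might look like, assuming pandas and scikit-learn. The file name, column names, and threshold are illustrative assumptions, not the report’s prescribed method; the point is that keeping the protected attribute in an agency’s own records lets its analysts check a model’s accuracy and error rates group by group before trusting a validation study from somewhere else.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix, roc_auc_score

def audit_by_group(df: pd.DataFrame, score_col: str, label_col: str,
                   group_col: str, threshold: float = 0.5) -> pd.DataFrame:
    """Compare discrimination (AUC) and error rates across protected groups."""
    rows = []
    for group, part in df.groupby(group_col):
        y_true = part[label_col].to_numpy()
        y_score = part[score_col].to_numpy()
        y_pred = (y_score >= threshold).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        rows.append({
            "group": group,
            "n": len(part),
            # AUC is only defined when both outcomes appear in the group.
            "auc": roc_auc_score(y_true, y_score) if len(set(y_true)) > 1 else float("nan"),
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
            "false_negative_rate": fn / (fn + tp) if (fn + tp) else float("nan"),
        })
    return pd.DataFrame(rows)

# Hypothetical usage: local case records already scored by a model that was
# developed or validated elsewhere. 'risk_score' is the model's output and
# 'reentered_care' is the locally observed outcome.
local = pd.read_csv("local_case_records.csv")
print(audit_by_group(local, score_col="risk_score",
                     label_col="reentered_care", group_col="race"))
```

Large gaps between groups in false positive or false negative rates are exactly the kind of finding an agency would want to surface with its community partners before, not after, a tool goes into production.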

Perhaps most importantly to journalists, the report addresses the concern that an expansion of predictive analytics within the public sector means that citizens will lose effective oversight of some of the most important decisions government makes about their welfare. What is an effective definition of “transparency” in a field where, increasingly, machine learning techniques are creating predictive algorithms that even their creators cannot easily explain? 

  • Focusing on algorithmic transparency is insufficient. While access to source code may be desirable for forensic or other reasons, it reveals little about the most likely ways predictive models worsen outcomes: through bad data and bad implementation. A more useful definition of transparency would demand information from government on how predictive tools are developed and used, what measures are reviewed to ensure their operation remains accurate and fair, and what policy structures are created to give the public ongoing oversight.

Work by Professors Robert Brauneis and Ellen P. Goodman, cited in the piece, grounds this claim with an extended list of elements that public agencies and their partners should ask to be disclosed – the result of several years the authors spent reviewing public contracts for predictive models collected through Freedom of Information Act (FOIA) requests.

These issues – how to define transparency, the need for explainable machine learning algorithms, and statistical approaches to measuring the fairness of predictive models – are the subject of intense debate. Many of the organizations that advised MetroLab’s report are actively engaged with public-sector leaders to create practical solutions, so that the use of predictive analytics in human services does not get out in front of its ethical “guard rails.” MetroLab welcomes feedback on the report.

About MetroLab Network

MetroLab Network introduces a new model for bringing data, analytics, and innovation to local government: a network of institutionalized, cross-disciplinary partnerships between cities/counties and their universities. Its membership includes more than 35 such partnerships in the United States, ranging from mid-size cities to global metropolises. These city-university partnerships focus on the research, development, and deployment of projects that offer technologically and analytically based solutions to challenges facing urban areas. In 2017, MetroLab launched its Data Science and Human Services Lab, an effort to bring together academics, city and county practitioners, and non-profit leaders to consider issues at the intersection of technology, analytics, and human services. MetroLab was launched as part of the White House’s 2015 Smart Cities Initiative.

Read the full report here.

Image: PearlsofJannah.
