The elusive case of police officer-involved homicides data


“Now data seems like a dry and boring word, but without it, we cannot understand our world and make it better. How can we address concerns about use of force, how can we address concerns about officer involved shootings, if we do not have a reliable grasp of the demographics and the circumstances of those incidents? […] Without complete and accurate data, we are left with ideological thunderbolts, and that helps spark unrest and distrust and does not help us get better.” (James Comey, FBI Director, 12 February 2015)

Newspaper headlines throughout 2014–2015 reported a spate of killings of unarmed African Americans by police officers around the United States. Incidents in Ferguson, Missouri; Baltimore; New York; and Los Angeles, among others, ignited a frustrated public who took to the streets and launched social media campaigns to challenge violence and racial profiling by police. Local and national administrations responded by implementing task forces and new policies on officer conduct. Yet even with new corrective measures in place, the disturbing trend of racially targeted violence exposed how difficult it is to tally the extent of these killings, owing to a significant gap in data on the number of police officer-involved homicides (POIHs) across the US. Scholars of criminal justice, public health, and public policy have documented these failings since the 1970s, but the outcry of 2014–2015 exposed the data’s alarming incompleteness to a wider public. As FiveThirtyEight reporter Reuben Fischer-Baum notes, the FBI’s Supplementary Homicide Report (SHR), the federal database of police shootings generally referenced by news reports, is an inadequate accounting of such statistics because it only publishes police homicides declared ‘justified’. Furthermore, law enforcement agencies are not mandated to report these deaths, leading to a reporting rate of less than a third. When states do report, the data may be collected differently across states and local jurisdictions.

Such elisions in official data come at a time of a so-called data deluge, as we increasingly turn to data as a mechanism for solving societal problems. This impulse was on display in one of President Barack Obama’s first memoranda in office. The Transparency and Open Government initiative in 2009 committed to “unprecedented levels of openness,” most visibly through a website of federal databases, data.gov. The website’s thousands of executive agency datasets are available without fees and with minimal licensing restrictions; they provide a window into government processes such as budgeting, environmental oversight, and scientific research. Providing a rich set of resources for research and technological innovation, the website also promises greater insight into government procedures.

The lack of data on POIHs, however, reveals ongoing gaps in the government’s transparency efforts. Despite the enormous apparatuses our government invests in other types of data collection, data.gov currently contains no downloadable national database of POIH data and only links to a website maintained by the Department of Justice, where the data is not easily accessible or downloadable. While Obama’s Police Data Initiative is a recent step towards remedying this situation, official information on such killings remains fragmentary and difficult to find. National-level data efforts are broadly concerned with measures of accountability, yet the elisions in these datasets seem to result primarily from the lack of a data assemblage that would support the consistent collection and recording of data, as well as the dissemination of the data that is collected, even though these large organizations would ostensibly have the resources and labor power to oversee efficient data production.

Some of the best-kept statistics on POIHs nationwide are not government-based but are collected by activist groups and newspapers. Two of the largest, KilledByPolice.net and Fatal Encounters, are civilian efforts. Operation Ghetto Storm, a 2012 report published by the Malcolm X Grassroots Movement, used statistical information from local police departments on police killings of African Americans in the U.S. The Center for Policing Equity at UCLA similarly collects and analyzes information on police-civilian encounters, studying racial profiling as one of four primary areas of concern. Recently, both The Guardian and The Washington Post have also established their own national counts of POIHs in the U.S.


Image: The Guardian’s POIH project, ‘The Counted’.

Case study: POIH databases of Los Angeles County

The first of our datasets is the FBI's Supplementary Homicide Report (SHR). The SHR, the most frequently cited of the federal datasets, was begun in 1962 as part of the more extensive Uniform Crime Reporting (UCR) database that the FBI has maintained for 85 years. While the older UCR provides annual counts of all recorded homicides in aggregate, the SHR supplements the UCR with granular details that provide some context for each event, particularly the victims’ relation to offenders. These details are manually recorded by local law enforcement agencies on a voluntary form; how the form is filled out can vary, and data fields are treated as optional. Once completed, the form is compiled and coded either by the FBI or by state reporting agencies to produce statistical data for all counties in the U.S. that report it. Through the form’s column labeled “circumstances,” the SHR allows agencies to report justifiable homicides by law enforcement, coded as “Felon Killed by Police Officer” (code 81). The FBI offers no evidence as to whether it provides additional oversight over the accuracy of the forms.
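To make the coding step concrete, here is a minimal Python sketch of how one might pull justifiable homicides by police out of SHR-style records. Only the circumstance code 81 comes from the text; the record layout, the field names, and the three sample rows are hypothetical placeholders, not the FBI's file format or real incidents.

```python
from dataclasses import dataclass
from typing import List, Optional

# Circumstance code the text gives for "Felon Killed by Police Officer".
FELON_KILLED_BY_POLICE = 81


@dataclass
class ShrRecord:
    """One SHR-style row. Field names are hypothetical, not the FBI layout."""
    agency: str
    year: int
    circumstance_code: Optional[int]  # None when the optional field is blank
    victim_age: Optional[int]         # None when demographics were omitted


def police_homicides(records: List[ShrRecord]) -> List[ShrRecord]:
    """Keep only rows coded as justifiable homicides by law enforcement."""
    return [r for r in records if r.circumstance_code == FELON_KILLED_BY_POLICE]


def share_missing_age(records: List[ShrRecord]) -> float:
    """Fraction of rows with no victim age recorded (0.0 for an empty list)."""
    if not records:
        return 0.0
    return sum(r.victim_age is None for r in records) / len(records)


# Invented sample rows, for illustration only.
sample = [
    ShrRecord("Agency A", 2011, 81, 34),
    ShrRecord("Agency A", 2011, 81, None),
    ShrRecord("Agency B", 2011, None, 27),  # circumstance field left blank
]

coded = police_homicides(sample)
print(len(coded))                # rows carrying code 81
print(share_missing_age(coded))  # share of those rows lacking an age
```

In this sketch a row whose circumstance field was left blank simply drops out of the count, which is one way optional fields on a voluntary form can turn into undercounts.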

As ongoing criticisms make clear, the SHR has very weak institutional, legal, and financial ensembles of support. Because the report is not legally mandated, many states decline to participate. According to data released on the SHR in 2003, 18 states had opted out of reporting on this classification during certain years, with Washington, D.C., Montana, and Nebraska opting out for at least 12 years and Florida opting out entirely. Even when a form is submitted, data entry is often incomplete: law enforcement agencies reporting a homicide will not always include demographic data, for instance. According to the Guardian, “In 2011, 31% of SHRs omitted the offender’s sex, age and race. When the victim was a black male, basic identifying data on the offender was omitted more often, 39.9% of the time”.

The SHR’s decentralized, bottom-up approach also creates problems with consistency: local sources rely on varying software and media to record their data, and these differences are hidden in the aggregate. As a White House press release reported, Camden PD “cobbles together 41 systems that have individual value, but are not designed to work together, requiring their beat officers to enter the same data multiple times”. Without standardization, the report cautioned, analysis of these sources may not be meaningful. Additionally, the SHR provides only information based on the initial police investigation, not on subsequent decisions made by prosecutors or courts.

Our second dataset, the National Vital Statistics System (NVSS), gathers reports that originate from death certificates completed by a coroner or medical examiner, as required by law in 36 states. In contrast to the voluntary SHR, the NVSS is mandatory. To be classified as a POIH, the certificate must list the manner of death as a homicide, then provide additional detail in an open text field that asks the coroner to “describe how the injury occurred”. Only if an officer is listed as a perpetrator in this description is the death coded, through the International Classification of Diseases (ICD-10) codes, as “Death by legal intervention.” Problems of reliability crop up, however, because the instructions for completing the form do not explicitly require that police involvement be mentioned at all, and coroners may not even know whether the deceased was involved in an attempted arrest at the time of death. Studies have shown the inadequacy of this data, with underreporting as high as 51% in some cases. The NVSS’s lack of guidelines for death certification makes underreporting inevitable. Furthermore, unlike the SHR, the NVSS only provides aggregate data at the county level, obscuring demographic data at the level of each incident. So while the NVSS captures the most detail, counting many aspects that the other datasets do not, such as victims’ marital status and educational attainment, it does not make this data public except in aggregate.
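The two-step classification rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NVSS coding procedure: the keyword list, the function name, and the return labels are hypothetical, and actual coding practice is certainly more involved than keyword matching.

```python
# Hypothetical keywords standing in for "an officer is listed as a perpetrator".
OFFICER_TERMS = ("police", "officer", "deputy", "sheriff")


def classify_death(manner_of_death: str, injury_description: str) -> str:
    """Apply the two-step rule from the text to one death certificate.

    Step 1: the manner of death must be certified as a homicide.
    Step 2: the free-text description must mention an officer.
    """
    if manner_of_death.strip().lower() != "homicide":
        return "not a homicide"
    text = injury_description.lower()
    if any(term in text for term in OFFICER_TERMS):
        return "legal intervention"
    return "other homicide"


# Invented descriptions, for illustration only.
print(classify_death("Homicide", "Shot by police officer during arrest"))
print(classify_death("Homicide", "Gunshot wound to the chest"))
```

The second call shows the failure mode the text describes: when the free-text field does not mention police, the death falls into the general homicide count even if an officer was involved.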

The third dataset, the LA Times’ (LAT) comprehensive Homicide Report, gathers statistics and analysis on all homicides within Los Angeles County. The Report is part of the LAT Data Desk; at a minimum it uses police reports corroborated with coroner’s reports, and it sometimes supplements these with investigative reporting when money and time allow. The data for each homicide is displayed publicly online on a dynamic map, as well as in individual posts describing each death. Each post is organized around statistical data capturing the neighborhood in which the death occurred, gender, age, race and ethnicity, cause of death, and whether an officer was involved. The LAT places a high priority on access, and its website makes information on these homicides easy to find. Its interface combines quantitative figures on police homicides with qualitative information drawn from its investigations. The LAT has an FAQ on its website explaining how the Homicide Report data is collected and processed, and each post includes the author’s contact information for questions or concerns from the public. The LAT’s data is browsable but not downloadable on its website; individuals can request the statistical data, which the LAT provides as an Excel spreadsheet.


Image: The LA Times’ Homicide Report.

SHR, NVSS, and LAT have concerns of much wider scope than the specifics of POIH; they are exhaustive statistical classification systems that attempt to capture an entire range of phenomena, whether all homicides or all deaths. Our fourth dataset, a report of the deceased collected by the Youth Justice Coalition (YJC), in contrast, devotes personnel explicitly to capturing POIH data. The YJC is a community organization devoted to issues around incarceration, youth, and race. The organization’s report is a database that uses coroner’s reports corroborated with police reports, and in some cases draws on interviews with the family of the deceased, as well as with eyewitnesses and community members in the area where the victim was killed. Alongside demographic data (age, gender, race), the neighborhood and address where the homicide occurred, and the date of death, the YJC in some cases also provides a photograph of the deceased and a short description of each incident of police homicide (for example, “Called to mental health facility; officers claimed they shot because Saucedo approached with ‘sharp object.’”). Its website is not as widely known as the LAT Homicide Report, nor does it incorporate interactive elements into the display of its information, but it does make available for download the PDF that documents this information.

The two county-only datasets by LAT and YJC diverge from the federal datasets in at least four important ways. First, the county data offers more granularity than the federal accounts, with data points capturing the place where the individual was killed and the victim's name. Second, in contrast to the federal datasets, the local data often uses more than one source to verify and detail the incidents. Third, the local datasets introduce qualitative information at the level of each death, sometimes as a result of investigatory effort. Finally, the local statistics are dynamic. In controversial cases, both organizations follow and document the legal proceedings that take place after the homicide and adjust their accounts as new information comes to light, capturing the contestations that can occur as a homicide is deemed justified or unjustified. The federal SHR, in contrast, maintains local police agencies as the arbiters of a death's justifiability. Its numbers do not capture any subsequent legal procedures that might prove the contrary.

The local data, it should be acknowledged, is limited for research in that it does not scale up: the methods used by these sources are specific to local phenomena, and the results cannot be analyzed alongside local data from other counties that do not use the same collection methodologies. Yet we have found that the local data provide a counter-narrative to the strictly quantitative federal accounts, which naturalize these recordings as official facts. Local data is a rhetorical tool that shapes how we understand police homicides. As our further research indicates, counter-data action that considers this smaller-scale local data alongside the larger federal data troves brings qualitative, interpretive analysis to bear on our understanding of POIH.

This article is an edited extract of a larger piece on counter-data (CC BY 3.0). Read the full report here.