A Data Journalist’s Life: Interview with Sarah Cohen


Sarah Cohen is an acclaimed American journalist who has worked with data for much of her career. She was a database reporter and editor at the Washington Post for 10 years before moving to Duke University, where she founded the Reporters' Lab and held the Knight chair in the Sanford School of Public Policy. Now a reporter at The New York Times, Sarah spoke to us about her experiences and views on the role of data in journalism.


When was the moment you realized that working with data would play an important role in your career?

Cohen: It was really in one of my first jobs. I started asking state officials for information on consumers’ problems in Florida, and they kept saying they could not give it to me because they didn’t know the answer and didn’t know how to get the information I needed.

I was lucky enough to work in a state where consumers could file a complaint through a form, which was then assigned a tracking number and entered into a database. This gave me an overview of what the database contained, and I knew that if they just gave me the whole dataset, the answer to my question would be easy to find. So I knew exactly what I wanted to do, and it was really just a process of learning how to get to this information.

Government is really good at collecting things and very bad at using what it collects. The state of Florida incorporated in its constitution the concept of open data much earlier than other states, so a lot of people coming from the state of Florida learnt this culture much earlier than others.

Was there any moment you thought working with data was not going anywhere and you were on the verge of giving up?

Cohen: Oh yes! With every story you do using any source you are ready to throw it all out the window at any point. There were stories that we never could do. I was lucky to work most of my career in a place [The Washington Post] that did not make me do just data journalism, and I don’t mean “just” because it is a huge skill. They let us work with the whole range of tasks.

When I worked on a project we would start it as a data journalism project, but sometimes it would be impossible to complete it as such, because there was no way to do it the way we had hoped. So we would just turn to regular street reporting and writing. You just do what you can: sometimes a project can’t be developed the way you originally wanted, so you fall back on the regular story.

Can data journalism lead to an enlightenment of civil society?

Cohen: That’s a big question… The point of journalism is to support democracy, which is what our cultures are all about. Especially in America, the whole idea of the freedom of the press is about an informed society.

So if data journalism can help us get people more engaged, can help us get to important information that they might otherwise not have and can help us present things in a way that will enlighten them more (and hopefully could have some transfer into democracies) that would be great. But I don’t know if anyone has proven it yet…

Can data journalism bridge the gap between data overload and audiences?

Cohen: I think that’s where the journalistic part of data journalism comes into play. We always had a million people we could interview and we had to choose some. That’s where I think you can’t divorce the “data” part from “journalism”.

In the past it was very difficult to present stories online and we needed specialists to do it, just as we needed data experts to handle and interpret data. Now the two skills are coming back together in the same professional figure, and that’s a positive development. So you, new aspiring data journalists, are coming at a good time. Let us make all the mistakes and learn from our experience!

How do you evaluate feedback from the audience? Is there a way to know your readers actually understood the message of your story?

Cohen: In my experience, sometimes the biggest feedback would come from the officials themselves, who did not know what was in their own records. My first (data journalism) story for the Washington Post was in 1988/1989: we got some information on how weapons moved through police systems back onto the streets. We got information from every local police department, there might have been a dozen of them, and after we did the investigation each police department called us and said: “Can you give us the information on our weapons, because we want to know what happened to them?” Then they learnt from it and changed their policies to prevent it from happening again.

So sometimes the reaction you get comes less from the public at large than from the officials, after they learn what is inside their file cabinets. Now you could argue that you wouldn’t need to write a story about it and could just give them their data back, but if you did so they wouldn’t care, unless it ends up on the front page of a newspaper or on the web.

I remember one story where we went to the chief judge, over and over, to try to show him, on the basis of data, what was happening in his courtroom, but he kept telling himself: “This is an isolated incident, this is not a big deal.” The day he saw the results of our investigation on the front page he changed the whole system. He said: “I just didn’t get it then.”

So these are the kinds of rewarding things that can happen. Of course, we hope that our work changes the lives of regular people. In Europe in general, the whole idea of investigative journalism is relatively new: you don’t have the same kind of culture of questioning the authorities that we have in the U.S. It is a big change to make.

What mistakes or attitudes should beginners in data journalism avoid?

Cohen: You should pay attention to your gut feelings and critically question the data. The big risk is that, when you go back to the government with something they don’t know, they have to believe you.

The authorities respond to your information as fact, because they have no way to disprove it. They might even try to rationalize the information in a way that makes sense, although reality might say the opposite. I learnt very early to believe them when they say “you are wrong”. They are almost never right about that, but I learnt to start with the assumption that I made a mistake. The biggest problem is that some people don’t learn enough about the process that goes into the data they are looking at. You can’t assume you know what a column or item means; you really have to know it. Otherwise your interpretation might drastically distort reality.