How to become a data journalist: School of Data Journalism - Day 3


This article is cross-posted on the web magazine of the International Journalism Festival in Perugia.


As the EJC/OKF School of Data Journalism at the International Journalism Festival in Perugia is coming to an end, Steve Doig, computer-assisted reporting expert and Knight Chair in Journalism at the Walter Cronkite School of Journalism, and investigative journalist Caelainn Barr dived into techniques for getting stories from data.

The penultimate workshop of this series, organised by the European Journalism Centre and the Open Knowledge Foundation, built in part on what Prof. Doig had covered in the workshop "Precision journalism." Caelainn Barr presented her work on the E.U. structural funds investigation as a case study to put theory into practice. One of the essential lessons festival participants will surely take home is how crucial Microsoft Excel is as a tool for data journalism.

Prof. Doig and several other speakers on the data journalism panels and workshops repeatedly stressed that everything starts with a simple Excel sheet. "Once you open the file one of the most basic things you can do is to look for outliers: check who is the most, who is the least, the best and the worst performer in your list. If the descriptions matching the items in your file don’t tell you much to understand what the data refers to, you will have to dig deeper with the authorities who provided the dataset," said Doig while demonstrating how to filter, sort and do basic calculations in an Excel file retrieved from ISTAT, the Italian National Institute for Statistics.
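For readers who prefer a script to a spreadsheet, the same first steps (sort, find the highest and lowest entries, filter, add up a column) can be sketched in a few lines of Python. This is only an illustration: the rows, the region names, the figures and the helper `rank_rows` are all invented here and have nothing to do with the ISTAT file used in the workshop.

```python
# Hypothetical rows standing in for a spreadsheet; names and figures are invented.
rows = [
    {"region": "Region A", "value": 120},
    {"region": "Region B", "value": 45},
    {"region": "Region C", "value": 310},
    {"region": "Region D", "value": 78},
]


def rank_rows(rows, column):
    """Return the rows sorted from highest to lowest on the given column."""
    return sorted(rows, key=lambda r: r[column], reverse=True)


ranked = rank_rows(rows, "value")

# The two ends of the sorted list are the first candidates for outliers.
highest, lowest = ranked[0], ranked[-1]

# A basic calculation: the column total.
total = sum(r["value"] for r in rows)

# A basic filter: keep only rows above an arbitrary threshold of 100.
above_100 = [r["region"] for r in rows if r["value"] > 100]

print("Highest:", highest["region"], "Lowest:", lowest["region"])
print("Total:", total, "Above 100:", above_100)
```

Sorting first and then reading off both ends of the list mirrors what a reporter does by eye in Excel; whether an extreme value is a story or a data-entry error still has to be checked with the source.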

Prof. Doig teaching the audience the many functions of Excel

"There are stories everywhere in data. Even the simple fact of being denied data by a given body is a story in itself. If you look at the columns in the spreadsheet, that will tell you what kind of stories you can get out of it. The secret to finding stories in data is finding paths, which are often revealed by doing calculations," he added.
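One common example of a calculation that changes what a table appears to say is turning raw counts into rates. The sketch below is a hypothetical illustration, not something shown in the workshop: the areas, counts, populations and the helper `add_rate` are all made up.

```python
# Hypothetical data: raw counts alongside the population of each area.
rows = [
    {"area": "Area A", "count": 500, "population": 100_000},
    {"area": "Area B", "count": 90, "population": 6_000},
    {"area": "Area C", "count": 300, "population": 40_000},
]


def add_rate(rows, per=1000):
    """Return copies of the rows with an added 'rate' column (count per `per` residents)."""
    return [dict(r, rate=r["count"] * per / r["population"]) for r in rows]


with_rates = add_rate(rows)

# Ranking by raw count and ranking by rate can point to different places.
top_by_count = max(with_rates, key=lambda r: r["count"])
top_by_rate = max(with_rates, key=lambda r: r["rate"])

print("Highest count:", top_by_count["area"])
print("Highest rate per 1,000:", top_by_rate["area"])
```

In this invented table the area with the largest raw count is not the one with the highest rate, which is exactly the kind of difference that prompts a follow-up question.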

Caelainn Barr introduced her work on the E.U. structural funds investigation, a project which took nine months to complete. An immense amount of work went into cleaning the data, which came in all sorts of formats; in some cases, such as Italy's, this made the task "diabolical." "This comes from the fact that all E.U. member states are required to publish this information but the Commission did not establish a mandatory format for them to do so," explained the investigative journalist, who worked for the Bureau of Investigative Journalism during the project.
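To give a sense of what cleaning data published in inconsistent formats can involve, here is a minimal sketch of one typical chore: reading money amounts written with different thousands and decimal separators. It does not describe how Barr's team actually cleaned the structural funds data; the function `parse_amount` and the sample strings are assumptions made for illustration, and the heuristic (a final separator followed by one or two digits is a decimal mark) will misread some inputs.

```python
import re
from decimal import Decimal


def parse_amount(text):
    """Convert an amount such as '1.234.567,89' or '1,234,567.89' to a Decimal."""
    # Drop currency symbols, letters and spaces; keep digits, separators and sign.
    cleaned = re.sub(r"[^\d.,-]", "", text)
    decimal_pos = max(cleaned.rfind("."), cleaned.rfind(","))
    digits_after = len(cleaned) - decimal_pos - 1
    # Heuristic: the last separator is a decimal mark only if 1-2 digits follow it.
    if decimal_pos != -1 and 1 <= digits_after <= 2:
        whole, frac = cleaned[:decimal_pos], cleaned[decimal_pos + 1:]
    else:
        whole, frac = cleaned, ""
    # Any separators left in the whole part are thousands separators.
    whole = whole.replace(".", "").replace(",", "")
    return Decimal(f"{whole}.{frac}" if frac else whole)


# Invented sample strings showing the same figure in different conventions.
samples = ["1.234.567,89", "1,234,567.89", "€ 1.234.567", "EUR 1234567.8"]
for s in samples:
    print(s, "->", parse_amount(s))
```

Using `Decimal` rather than floating-point numbers keeps the cents exact, which matters when amounts from many sources are later summed and compared.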

As she progressed through the analysis of E.U. structural funds recipients, Barr realised that "regional authorities had no clue how the money was being spent because they did not have a database storing this information. All these details were out there but it took ages to compare things." The database is now public and anyone can browse and search through it in order to find stories: "We encourage you to use the database. Absolutely get in contact if you have a new project you would like to work on. We had media and broadcast companies interested in stories generated from the database. For example, the BBC was interested in Italy and Spain, and Al Jazeera did a report about the Italian mafia and structural funds."

Once again the workshop venue at Hotel SanGallo was full of data enthusiasts, who engaged in an intense Q&A session towards the end of the workshop.


Investigative journalist Caelainn Barr

Prof. Doig concluded by reminding the audience: "The most important question to ask in any story is 'why'. It is the same with data journalism: once you identify a pattern, ask why that is. Ask the police, the people, the authorities and do the same kind of reporting you normally do in your journalistic profession. Data journalism is just another source, it is a step to the story, it is not a different way of thinking," or, to use the words of Pulitzer Prize winner Paige St. John, "doing data journalism is not a way to stay away from the streets."

The last workshop of the Data Journalism School, Spending Stories, will take place on Sunday 29th April at 14:00 at Hotel SanGallo.