Expect the unexpected: Telling stories with open data


In our 5th edition of Conversations with data, we asked about telling stories with open data. Pinar Dağ shared the following case studies from her work in Turkey.

The Open Database of Deceased Workers in Turkey

After the tragic Soma mine disaster in 2014, it proved extremely difficult to document the working conditions of employees. There were discrepancies in the reported number of unionised workers, and no sufficiently transparent data to account for the deaths of workers over previous decades. What was available was disorganised and lacked the detail needed to analyse it effectively.

We wanted to open up this data, and shed light on the deaths of workers in other sectors.

With this in mind, we developed the Open Database of Deceased Workers in Turkey. It was a public project that verified data from multiple sources, presented it in different formats, and was open to access and use by everyone.

In Turkey, at least 130 workers die each month, for a variety of reasons. The most important goal of the project was to raise awareness of these deaths and their frequency, as well as to give recognition to the victims and draw attention to the poor working conditions that they endured.

The project consisted of embeddable maps, graphs, and data in different formats. It covered the deaths of workers in over 20 sectors from 2011 to 2014. After the project was completed, we continued to report the deaths of workers through regular media monitoring each month. Importantly, our data includes the name of the company.

Image: Our dataset of company names, based on media monitoring, as of 2017.

The project began back in 2015. We started by lodging Freedom of Information (FOI) requests and collecting data from some trusted NGOs, who extract data from different sources and open it up for anyone to use.

The first challenge: it wasn’t easy to get the open data via FOI. Sometimes it took two weeks, other times it would take four months.

But then a more unexpected challenge arose.

When we announced the project, one of the NGOs whose data we were using was unhappy about it. They claimed that the project did nothing more than reuse their open data, and they felt that, by using their data in this way, we had neglected their labor. We had asked for permission, and their data is open to anyone, but they still opposed the project. They even stopped sending out their data bulletin as regularly as they used to! Imagine: our project made their data easier to access, in a visually understandable and downloadable format, and yet they didn’t like us using it. Interestingly, they thought we pornographicised workers' deaths with the visualisation and numbered filtering below!

Of course, the human stories were always important, but it was also in the public interest to structure and present the unstructured, raw data. Unfortunately, it was not possible to convince them of this.

Looking back, I think the most important point here was that they did not know the importance of the open data ecosystem. On the surface it might not seem like a big challenge, but without a local understanding of the value of open data, we will always face such difficulties.

We ended up uploading the monthly worker deaths by comparing state open data (remember those FOI requests) with the list that we collected through our own monitoring (see the spreadsheet above), and verifying these monthly.
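For readers who want to try a similar monthly cross-check, here is a minimal sketch in Python. It is not the code we used: the function name `compare_monthly`, the field name `date`, and the sample figures are all hypothetical, and it assumes the official source gives one count per month while the monitoring list has one dated record per death.

```python
from collections import defaultdict


def compare_monthly(official_counts, monitored_records):
    """Compare official monthly totals with counts from a monitoring list.

    official_counts: dict mapping "YYYY-MM" to a reported total (assumed format).
    monitored_records: list of dicts, each with a "date" in "YYYY-MM-DD" form
    (hypothetical field name).
    """
    # Count monitored records per month using the "YYYY-MM" prefix of the date.
    monitored = defaultdict(int)
    for record in monitored_records:
        monitored[record["date"][:7]] += 1

    # Report every month that appears in either source.
    report = []
    for month in sorted(set(official_counts) | set(monitored)):
        official = official_counts.get(month)  # None if the state gave no figure
        seen = monitored.get(month, 0)
        gap = None if official is None else seen - official
        report.append(
            {"month": month, "official": official, "monitored": seen, "gap": gap}
        )
    return report


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    official = {"2014-05": 3, "2014-06": 1}
    records = [
        {"date": "2014-05-13"},
        {"date": "2014-05-20"},
        {"date": "2014-06-02"},
        {"date": "2014-07-01"},
    ]
    for row in compare_monthly(official, records):
        print(row)
```

Months where the gap is not zero, or where one source has no figure at all, are the ones worth checking by hand before publishing.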

Ultimately, the project was a success, recognised as a finalist in the Data Journalism Awards 2015. Unstructured data “kills life” but open data saves lives!

Data on missing children

In April 2014, the disappearance of a child in Istanbul captured the public’s attention. Hundreds of children go missing every day all around Turkey, but this particular case received substantially more media attention, for reasons which were unclear.

We therefore wanted to investigate the data on these other missing children. But, again, we were faced with a challenge: at that time there were three different sets of data from three separate sources. According to the Turkish Statistical Institute (TSI), there were 27,000 children missing. The Ministry of the Interior said 15,000, while the Lost Relatives Families Association (YAKAD) was reporting 30,000.

What did we do about this?

We produced news articles highlighting the discrepancies between these three sources. Since there was already high interest in the missing Istanbul child case, this brought awareness to the fact that there were problems with the data for this very sensitive and important issue.

Image: News story, in Turkish here.

What has happened since?

Back in 2014, we made two FOI requests to the Ministry of the Interior, but we didn’t receive any answers. We sent the questions again, but never received a response. So, we started to collect our own data using alert systems (like Google Alerts) to compare search engine results with official sources. We also collected data by scanning the media, scraping data (data miner, import.io) and following missing-person notices. However, this was tiring and intensive work, so we stopped in late 2016.

This brings me to my advice: it is always important to create an alternative open data source.

That said, you cannot collect data for every project you do. For this reason, it is important that journalists share their data, and set an open standard for everyone to use.

Image: Poster on missing children, in Turkish here.

Other recent examples

Last year, I experienced similar problems when working on this story. Data was scattered, sources were difficult to verify, official institutions were not able to provide full data, and there was no license standard. But despite all this, we were able to complete the project by calling all of our sources, asking for their ideas, and reading many resource books.

Image: Screenshot from the final project, live version here.

And now, I am experiencing the same problems with a project on violence against animals in Turkey. Even when institutions working on animal rights collect the data themselves, it isn’t even available as a PDF! The only way you can get the data is by scraping it from HTML pages. So, it does not meet any open data standard at all. It would be great to get raw data, but it's hard even to find it. However, I’m still persevering with the project, and collecting data in whichever way I can.
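To give an idea of what scraping from HTML pages involves, here is a minimal sketch using only Python's standard library. It is an illustration, not the tooling used for this project: the class `TableParser`, the function `html_table_to_csv`, and the sample table are hypothetical, and it assumes the page holds one simple table without nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table row on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Turn the table rows found in an HTML string into CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample page, for illustration only.
    sample = (
        "<table><tr><th>Date</th><th>City</th></tr>"
        "<tr><td>2018-01-05</td><td> Izmir </td></tr></table>"
    )
    print(html_table_to_csv(sample))
```

Saving the result as a CSV file is one simple way to turn a page that was never meant to be reused into something others can open and check.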

To overcome these challenges, we are working hard to develop this area here in Turkey. We’ve held the first local Open Data Conference, we participate in the OKFN Open Data Index, and translate Open Barometer reports annually. Recently, we translated the Open Data Handbook, and we’re developing a training project to promote open data advocacy within local governments.

While opening data may not be romantic, it’s important to show data custodians that it can give us a fairly realistic future. For this reason, open data advocacy, and motivating governments in this direction, should be promoted by the media, academia and non-governmental organisations.

Some other general advice to overcome open data challenges

  • Whether you are an instructor in public administration, communication, new media, engineering, or science, you should add open data to your curriculum for future development. I always explain the importance of open data in all my data journalism teaching.
  • If your open data ecosystem, like ours in Turkey, is not yet developed, your priority should be to communicate with the right institutions. First of all, you should advocate the benefits of open data to public institutions and government officials.
  • If you are a non-profit organisation, you should explain the value of open data with local examples for your region. You should work on open data standards, and develop a bibliography. For instance, we've prepared the first open data training in Turkey.
  • Help other media use open data. I prepared a file on how open data projects are developed so that the media can use open data for their work too. I update it regularly, and it’s open for anyone to access. I also help when they come with questions.
  • Develop resources, make local projects, hold local open data conferences, and open data trainings. They’re crucial to understanding problems, and providing solutions so that you can improve open data globally and locally.

For more musings on open data, check out the 5th edition of Conversations with data here. You can subscribe here.