RegioHack: data from the region


Martine Rouweler recently visited RegioHack, a 30-hour hack event held in Enschede, the Netherlands, on 10 and 11 November and organised by the regional newspapers De Stentor and TC Tubantia together with the Saxion school of higher education. (Please note: all links lead to websites that are in Dutch.)

Since several national and international hacking events have already taken place, it was only a matter of time before someone did the same on a regional level. Jerry Vermanen, an editor at the Dutch regional newspaper De Stentor, had the idea of organising a regional hack event: an event where developers, designers and other people with ideas come together to explore new fields, in this case data journalism. He decided to put the event, called RegioHack, together after attending the Hacking Journalism event in Utrecht. Having come across data journalism a few times and attended several events, he concluded that data will increasingly be published. If journalists want to keep their role as watchdogs over the actions of governments and businesses, a role that hits particularly close to home in regional news, then data journalism is the way to go. According to Vermanen, “the trend of putting data online isn't going to be reversed, so yes, data journalism is going to stay and probably be integrated into regular investigative journalism.”

RegioHack is a cooperation between the regional Dutch newspapers De Stentor and TC Tubantia and the Saxion school of higher education. The plan was to put journalists, who had relatively little experience with searching the web for data and transforming it for their use, together with programmers, who had plenty of experience with scripting, scraping and visualising data but none with journalism. The goal was to give six teams 30 hours to explore and try out data journalism, providing them with the tools they might need to produce some news stories. “Many journalists see data in front of them and get scared or hopeless when it is bigger than a few lines in a table,” said Vermanen. “Data journalism is about datasets that are bigger than that and we want journalists to get a taste of what it is to work with such data. The next time they see data, they won't fear it quite as much.”

Because RegioHack participants were new to data journalism and the organisers did not want to scare them away from it forever, participants were assigned to groups and themes before the event started, and were given some possible angles and a list of data-rich websites. Instruction videos from a previous workshop and online data analysis tools were also made available.

Themes at the event ranged from health and medicine use, to criminality and traffic. These themes encouraged participants to come up with questions, find datasets that might answer their questions and brainstorm ways to represent their findings in a news article.

On the first day of RegioHack, in one of the Saxion media rooms, several groups were working diligently on their projects while excitedly interacting with one another. The atmosphere was relaxed and met Vermanen's expectations.

Freke Remmers, a RegioHack participant and self-professed novice in the field of journalism, explained how her group started that morning on the topic of risk and safety and how they attempted to find data. The group thought they had hit an immediate data jackpot because they could see discrepancies in the data on businesses that pose a potential health risk to people living close by. Further research showed, however, that the data was correct, which unfortunately debunked their story.

Other participants told similar stories of a successful start, but after nine hours of searching, fatigue and despair at not finding what they needed set in. Luckily, food finally arrived in the shape of pizza, and the projects were temporarily put aside, allowing the relaxed atmosphere to return.

The work did not stop with dinner, though. Heinze Havinga, a fellow organiser with Vermanen, was one of the people who stayed up until 2:30 AM, working on a script that would scrape information from LinkedIn. Eventually Havinga was able to get some sleep, but at 9 AM the next morning everyone was back at work.

While day one brought some difficulties, day two showed much better prospects. On the first day, Josien Kodde of TC Tubantia had explained how hard it was to find data. Her group was looking for data on diabetes in Twente and the Achterhoek, the areas covered by TC Tubantia and De Stentor respectively, and ran into the problem that different institutes supplied slightly different types of data per region. The quest for information seemed hopeless, but the next day the participants seemed like different people. Kodde's group had started with a division of labour and they were all confident that they now had something. Their programmer was even able to visualise some of their findings, even though some of the requested data had not yet arrived by the end of the event. In the end all six groups had data to present.

Freke Remmers’ group started with safety and risk but ended up working on EU stimulation subsidies in the different Dutch regions and who gets what.

By the end of the 30 hours their project had potential but was not yet ready for publication. Josien Kodde’s group was a bit further along in the process, as they were trying to determine whether the region was prepared for the rise in diabetes patients in the next few years. A definitive answer on a sensitive subject is difficult to obtain, though the message they wanted to publish was clear: 'All hands on deck for the silent assassin.'

The group that worked on criminality also obtained an interesting result. They looked into two different tracks: the registration of violence through camera surveillance, and the use of Burgernet, which is set up to let citizens help make their neighbourhood safer. Both tracks led to the same finding: hardly any data is available about their effectiveness, which is especially damning for Burgernet, since the project costs millions.

Not everyone took the same route in finding and publishing their data. The group that was working on money, power and influence set out to find the most powerful person in the region and in the end decided to build their own interactive database. Heinze Havinga's LinkedIn script looked for people who held several influential positions on boards and committees. In the group’s final presentation over 200 people were represented, but they were nowhere near finished. The group was so enthusiastic, however, that they decided to continue the project and hope to publish it and keep it up to date in the future.

In the end, participants came to recognise the need for 'tech savvy nerds'. They left with what many called ‘an incredible learning experience’ and were convinced that they should add data journalism to their skill set. When asked if he considered the event a success, Vermanen replied, “We had hoped for a 50% success rate, or that at least 3 groups would find something. But we got so much more.”


The RegioHack topics handout.
After 9 hours of work on the first day.

Heinze Havinga working on the LinkedIn script.
Josien Kodde and her group presenting their findings.