30,000 ways to tell local stories with data


What kind of stories would you produce if you had €706,000? For the Press Association and Urbs Media the answer is easy.

Having just been awarded €706,000 by Google's Digital News Initiative Innovation Fund, the pair are aiming to provide up to 30,000 data-driven stories each month for hundreds of local media outlets through a new service called Reporters And Data And Robots, or RADAR.

RADAR is intended to meet the increasing demand for consistent, fact-based insights into local communities, and will derive its stories from open data sets.

We spoke to Alan Renwick, the Chief Executive Officer at Urbs Media, to find out more.

DDJ: What problem is RADAR trying to solve?

AR: The huge increase in open data provides a potential boost for local media’s efforts to hold local government, health and education authorities, police forces, et cetera, to account. However, few individual local news outlets have the resources to access, analyse and interpret this ocean of data. RADAR aims to solve this problem by taking a scale approach – that is, doing it once on everyone’s behalf.

What makes local journalism a good fit for data-driven storytelling?

A significant chunk of UK open data has localised breakdowns – by local authority area, NHS trusts, et cetera. By investigating these data releases centrally, then using software to reversion for each geography covered, it is possible to give every newsroom a daily diet of stories to publish or develop at a local level. The UK’s traditional local news industry supports 1,700 online brands, and there are many more independent and hyperlocal publishers. It would not be economically feasible for them to duplicate this effort.

RADAR is to create 30,000 localised stories each month from open data sets. How will the service handle such a large volume? What challenges do you foresee in meeting this target and how do you intend to overcome these?

This target is based on the current flow of interesting and useable datasets, multiplied by the number of local areas each one covers. For example, in a recent month we found about 150 good datasets, with an average of 200 local versions each. The RADAR project has been designed to build an editorial team able to produce this sort of volume, with an integration into PA’s systems which will allow these stories to be distributed to the right outlets. Of course, this is uncharted territory, so our plan is to introduce volume gradually to ensure that the quality of the stories produced and distributed is high.
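
To see where the 30,000 figure comes from, the arithmetic Renwick describes can be written out in a few lines. This is only an illustration using the two round numbers he quotes; the variable names are ours, not RADAR's.

```python
# Figures quoted in the interview for "a recent month".
good_datasets_per_month = 150      # interesting, useable datasets found
local_versions_per_dataset = 200   # average number of local breakdowns

# Each dataset yields one story per local area it covers.
stories_per_month = good_datasets_per_month * local_versions_per_dataset

print(stories_per_month)
```

The product matches the monthly target of 30,000 localised stories.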

RADAR will use Natural Language Generation (NLG) software to produce multiple versions of stories. What is NLG and why was it chosen?

The form of NLG we use is essentially an editorial template. Our reporters write a story based on the datasheet they are working with. They then use the NLG template to imagine the different stories that could be told for each data outcome. We then run the data through the template and the NLG outputs tailored versions for each local area. We use NLG because it retains journalistic control of the process – our reporters are finding and writing the stories; the NLG does the production.
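
For readers curious what template-style generation of this kind might look like in practice, here is a minimal sketch in Python. It is not RADAR's software: the area names, figures, wording and function names are all invented for illustration, and it assumes every area has a non-zero figure for the previous year. The idea it shows is the one Renwick describes: a reporter writes the wording for each possible data outcome, and the program fills in a version for every local area.

```python
# Hypothetical dataset: one row per local area (all values invented).
rows = [
    {"area": "Exampleshire", "this_year": 120, "last_year": 100},
    {"area": "Sampleton", "this_year": 80, "last_year": 100},
    {"area": "Testford", "this_year": 50, "last_year": 50},
]


def describe_change(this_year, last_year):
    """Pick the reporter-written wording that fits this data outcome."""
    # Assumes last_year is greater than zero.
    if this_year > last_year:
        pct = round(100 * (this_year - last_year) / last_year)
        return f"rose by {pct}%"
    if this_year < last_year:
        pct = round(100 * (last_year - this_year) / last_year)
        return f"fell by {pct}%"
    return "was unchanged"


def render_story(row):
    """Fill the editorial template with one area's figures."""
    change = describe_change(row["this_year"], row["last_year"])
    return (
        f"The number of recorded incidents in {row['area']} {change} "
        f"over the past year, from {row['last_year']} to {row['this_year']}."
    )


# One tailored version of the story per local area.
stories = [render_story(row) for row in rows]
for story in stories:
    print(story)
```

The editorial judgment sits in the wording of each branch, which a journalist writes; the loop at the end only handles the production of the local versions.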

Find out more about RADAR here.

Image: John Loach.