Local datavores meet hyperlocal journalism


Powerful big data analytical tools, Internet of Things technologies, sensors and new methods of collecting data tend to attract much of the attention about the ways data can transform councils. But new Nesta research indicates that councils are starting to find significant impact simply through better analysis of data they have often held for years. Termed ‘local datavores’, these data-driven local authorities are finding that there is as much to gain from small data as from big data. For journalists reporting on local issues, these data-driven opportunities can provide a valuable resource of hyperlocal information.

One of the most promising areas of data science for local issues identified by Nesta is the application of machine learning and predictive algorithms. In local government, predictive analytics makes it possible to understand the likelihood of future events with far greater accuracy, and to find patterns in existing data sets with greater sophistication. Although Nesta focused on use cases geared towards policymaking, predictive analytics can just as easily be capitalised on by journalists to investigate the impact of local programmes and policies.

Predictive analytics techniques such as clustering and classification provide a robust means of understanding the relationship between government policy decisions or interventions and future outcomes, thereby enabling resources to be allocated more efficiently. They can also help to identify potentially adverse events more accurately, and to gauge the likely effectiveness of available interventions. This is one of the big missing pieces in the prevention jigsaw, and programmes now in operation around the world suggest this approach could deliver significant value for local government.
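To make the clustering idea concrete, here is a minimal sketch of k-means clustering on invented, hypothetical service-demand records (the data points and the two-group scenario are illustrative assumptions, not from Nesta's research):

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iters=20):
    """Minimal k-means: deterministic farthest-first seeding, then Lloyd iterations."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Seed each new centroid at the point farthest from all existing centroids.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = (sum(x for x, _ in cl) / len(cl),
                                sum(y for _, y in cl) / len(cl))
    return centroids, clusters

# Hypothetical records: (contacts per year, cost index) for two
# clearly distinct groups of service users.
low = [(1.0, 1.2), (1.5, 0.8), (0.8, 1.0), (1.2, 1.4)]
high = [(8.0, 9.1), (8.5, 8.7), (9.2, 9.5), (8.8, 9.0)]
centroids, clusters = kmeans(low + high, k=2)
```

An analyst (or journalist) could then inspect each cluster's centroid to characterise the groups, for example distinguishing low-frequency, low-cost users from intensive ones.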

For example, in New Zealand, the University of Auckland has developed a predictive risk model which assesses the likelihood of a child being abused in future. The predictive risk model is now being tested on a much larger data set before it is rolled out for practical use.


Image: The University of Auckland’s risk scoring for a maltreatment finding by age 5.

Similar models have been developed across the US, and in the UK, councils such as Bristol, Westminster and Manchester are developing or trialling forms of predictive analytics in children’s social care as a means of targeting the early provision of support services.

Predictive analytics have also been used for regulation and disaster management – approaches that could be reappropriated by data journalists reporting on emergencies. In New York, for instance, the Mayor's Office of Data Analytics has combined multiple data sets covering over 60 risk factors to predict which buildings are most likely to have a fire.


Image: Location of fires as predicted before and after the use of the model.
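A risk model of this kind can be thought of as combining many per-building risk factors into a single score used to rank inspections. The sketch below is purely illustrative: the factor names, weights, and addresses are invented assumptions, not the New York model:

```python
# Hypothetical risk factors and weights -- invented for illustration,
# not the actual New York fire-risk model.
WEIGHTS = {
    "open_violations": 0.4,
    "building_age_over_50": 0.2,
    "prior_fire_incidents": 0.3,
    "illegal_conversion_complaints": 0.1,
}

def risk_score(building):
    """Weighted sum of risk factors; higher means riskier."""
    return sum(WEIGHTS[f] * building.get(f, 0) for f in WEIGHTS)

# Hypothetical buildings with counts/flags for each factor.
buildings = {
    "12 Elm St": {"open_violations": 3, "building_age_over_50": 1,
                  "prior_fire_incidents": 2, "illegal_conversion_complaints": 1},
    "8 Oak Ave": {"open_violations": 0, "building_age_over_50": 1,
                  "prior_fire_incidents": 0, "illegal_conversion_complaints": 0},
    "3 Pine Rd": {"open_violations": 1, "building_age_over_50": 0,
                  "prior_fire_incidents": 1, "illegal_conversion_complaints": 2},
}

# Rank buildings from highest to lowest predicted risk.
ranked = sorted(buildings, key=lambda b: risk_score(buildings[b]), reverse=True)
```

In practice such weights would be learned from historical fire data rather than set by hand, but the ranking step, which turns scores into an inspection priority list, works the same way.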

Challenges of this approach

Despite the clear potential benefits of these approaches, this is an area in which important ethical questions must be put at the forefront of the debate. ‘Predictive policing’ in the US has exposed some of the most problematic downsides of machine learning applications in public services. Cities such as Fresno are now using algorithms to identify potential ‘criminals’ before they have committed a crime. Drawing on Internet of Things technology, social media data, existing police data and citizens’ records, programmes such as Beware can scan streets or areas for potential threats. Individuals deemed likely to commit a crime can be identified, and police notified to take pre-emptive action by issuing warnings.

One of the key challenges is that machine learning can entrench existing prejudices or biases into computer code. Predictive policing models, for instance, have been criticised as akin to racial profiling, disproportionately targeting ethnic minorities for crimes they haven’t committed. In addition, the code that underpins the analysis is often not open or transparent, making it hard to scrutinise the assumptions built into it.
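The mechanism by which bias gets entrenched is easy to demonstrate. In this sketch, the data, neighbourhood labels, and rates are all synthetic assumptions: if historical records over-represent one neighbourhood because it was patrolled more heavily, a model that simply learns rates from those records will reproduce the disparity, regardless of underlying behaviour:

```python
from collections import Counter

# Synthetic historical records: (neighbourhood, was_stopped).
# Neighbourhood "A" was patrolled far more heavily, so its residents
# appear stopped more often -- a bias in the data collection,
# not a difference in behaviour.
history = ([("A", 1)] * 80 + [("A", 0)] * 20 +
           [("B", 1)] * 20 + [("B", 0)] * 80)

def train_flag_rate(records):
    """'Train' a naive model: per-neighbourhood historical stop rate."""
    stops, totals = Counter(), Counter()
    for hood, stopped in records:
        totals[hood] += 1
        stops[hood] += stopped
    return {h: stops[h] / totals[h] for h in totals}

model = train_flag_rate(history)
# The "model" simply reproduces the historical disparity between A and B.
```

Real predictive policing systems are far more complex, but the underlying problem is the same: a model trained on skewed records inherits the skew, which is why transparency about training data and code matters.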

With this in mind, Nesta has been undertaking work on the responsible use of machine learning in public services, and has stressed the importance of developing transparent ethical frameworks to oversee the use of machine learning and algorithms in public services.

For journalists, therefore, the use of predictive analytics represents not only an opportunity for their own reportage, but also a consideration in their reporting on the activities of local datavores. Following the example of ProPublica’s recent ‘Machine Bias’ exposé, hyperlocal journalists must scrutinise how data is applied by local governments and the ensuing impacts on communities.

This piece includes extracts from Nesta's report ‘Datavores of Local Government’ (CC BY-NC-SA 4.0). Read the full report here.

Image: iamdanw.