A vaccine against that anti-data journalism brain


There is something strange in that buzzword, “data journalism”. It reads like an oxymoron.

Data and journalism are traditionally not seen as friends – and often as foes. Journalists, suffering from a “blind spot” for numbers, tend to dismiss data and statistics altogether. For some, they are hard to swallow and fly in the face of what journalism is about. Quite a few see numeracy as “a kind of virus which, if caught, can damage the literary brain, leading to a permanent loss of vocabulary and shrivelling of sensitivity,” observes David Randall in The Universal Journalist. 

“Journalism is one of the few professions that not only tolerates general innumeracy, but celebrates it,” said Aron Pilhofer in response to recent research. “I still hear journalists who are proud of it, even celebrating that they can’t do math, even though programming is about logic.” Meanwhile, numeracy is rarely, if ever, treated as an essential skill in journalism training and education.

No wonder the media often “get a bad press” when it comes to statistics – to the extent that some experts have come to assume journalists never get numbers right. That assumption is not entirely correct, but it is rarely challenged, probably because few care or dare to challenge it.

And perhaps it is no wonder that the driving force of many data journalism projects today is not journalists but those with non-journalism skills and experience – developers, designers and producers. Some influential names that have triggered the interest in data journalism, such as Nate Silver or Ezra Klein, were never journalists. This is despite the inspiring work on computational journalism pioneered by Philip Meyer in the 1960s.

So if the seemingly sudden rise of data journalism says anything to journalists, it is this: their traditional luxury of ignoring – even laughing at – statistics is no longer sustainable. Public datasets are now more available and accessible than ever. The ability to use simple computer code to turn raw numbers into beautiful data maps or informative graphics is exciting and empowering. But all this will become meaningless if that deep-rooted anti-data attitude among journalists is not extinguished.

An increasingly chaotic world of “lies and damn lies”

Which leads us to a more general point: every journalist – with or without “data” in their job titles – has the duty to learn to deal with data. The recent hype around big data might distract us from the fact that statistics have long been part and parcel of the fabric of contemporary societies. From the quality of the air we breathe to the national leader we choose, almost every aspect of modern life is statistically measured and represented in one way or another. “If you live in Britain, there’s no such thing as a day untouched by the Office of National Statistics,” declared The Guardian in the caption of a recent graphic.

With that, sadly, comes an increasingly chaotic world. People tend to place more faith in numbers than words, often giving them authority at face value. But statistics, despite their objective look, are still a subjective human invention: they can be inappropriately produced or improperly used, for all sorts of benign or malicious reasons.

Major social, economic and political institutions very well understand the deep and powerful penetration of statistics into the way people think, believe and behave. Their increasingly complex, resourceful “news management” machines would flood newsrooms with all sorts of data that work to their advantage.

One result is a deluge of “bad statistics” out there. In some cases, they are data from projects where the “researchers” know the conclusion before they start. In others, it is about “massaging” data to advance some interests at the expense of the rigour of the data. In more shocking cases, the data arriving at the news desk do not exist.

The opposite – the urge to hide sensitive and unfavourable data – is also common. Among the tactics often used to “bury bad news” are releasing a huge amount of data all at once to make it hard for journalists to spot controversial numbers, or choosing to publish data at a time when journalists are too busy to treat them with sufficient attention and due diligence. Some powerful institutions go so far as to create “flak” – i.e. making it expensive, intimidating and even hopeless for journalists to obtain data.

That is why journalists’ traditional “laissez-faire” approach to data and statistics – on the naïve and convenient basis that numbers speak for themselves – can no longer be afforded. Sometimes, uncritical reporting of numbers leads to sadly hilarious claims. But there are more serious consequences. Without a watchful eye, journalists might be easily lured away from vital figures that the public needs to know but sources want to conceal. They would not be able to detect, much less expose, misleading or fraudulent data of paramount importance to the public.

More commonly, they would serve as unwitting mouthpieces for powerful and resourceful sources. Very often, without the confidence to scrutinise and challenge data, journalists opt for the easier route: they adopt the superficial but safe “he said, she said” formula and highlight to audiences what is already highlighted by sources. Even in the emerging world of data journalism, although journalists have more control of raw datasets, most still don’t go further than relaying “friendly data” from powerful institutions.

The problem can be worse when journalists force-fit and shoe-horn numerical facts into their own personal perception and/or professional framing of the world. “Journalists tend to use statistics to reinforce their own views and pre-conceptions of reality,” Michael Blastland, co-founder of BBC Radio 4’s More or Less, told us a couple of years ago. “They take data that can fit in their own narrow scope of what the story should say. They have to fit the format of news stories that have a beginning and an end.”

American scholar Kathleen Geier is among those deeply concerned. It is, she notes on a Washington Monthly blog, “bad enough” to have ideologues “deploy dubious studies and statistics directly” without being held accountable. It is worse, however, “when shoddy research and dubious factoids get respectful attention from mainstream reporters and pundits”. This, for her, happens all too often and, with the aid of journalists’ desire for sensationalism, “crap statistics” continue to bamboozle the public and damage the body politic.

Examples abound to support Geier’s misgivings. One recent study by Muhammad Idrees Ahmad shows how a “docile press” fetishizes flawed and dubious body counts, projecting and sustaining the false public image of deadly US-led drone strikes in Pakistan as a surgical war with little collateral damage.

Or think of the poor crime data that have promoted an excessive fear of crime, created a state of moral panic, and pushed the public and elected leaders into unsound decisions. Or the many unnecessary worries, false hopes, meaningless lifestyle changes, wasteful medical spending and undue resistance to doctors’ advice that are generated by questionable medical statistics in the news. Or the “mum and dad” investors who lose entire life savings due to inaccurate, insensitive or naïve reporting on stock market data. Or the tension, conflict or bloodshed fuelled in part by poorly reported numbers. The list goes on.

A long-overdue vaccine

Does this mean journalists can check every statistic that comes their way? Realistically, we doubt that they can in the current climate. Journalists face increasingly intense newsroom pressures that result from a detrimental confluence of declining revenues, shrinking resources and increasing demands for 24/7 deadlines, multi-platform delivery, multiskilling tasks and so on.

It is, however, precisely this pressure that emphasizes the need for every journalist to possess a basic level of statistical competence. Instead of a virus damaging the literary brain, that competence should be seen as an effective vaccine that protects the brain against the risk of falling victim to bad statistics and to the data and people behind them.

And if data have a chance to become the future of journalism as Tim Berners-Lee and other thinkers predict, that future starts from this immunized brain. Without it, there would be no foundation stone for journalists to explore the many horizons and possibilities that data are opening. With it, journalists can go very far in their careers, as already seen in numerous inspiring examples of ground-breaking and world-changing data-driven journalism such as the exposure of the British MPs’ expenses abuse by The Daily Telegraph, or the “offshore list” of multinational tax evaders by the International Consortium of Investigative Journalists.

Here it is worth remembering that statistics are not mathematics. Handling numbers for the news is often wrongly perceived as using eye-numbing formulae to calculate things. But the two are different beasts: you don’t need to be adept at mathematics to be able to use statistics effectively. However frightening it might look, statistical enquiry is about valid reasoning, not calculation.

What journalists need is the habit of questioning data in the same way as they question any other kind of raw information. Where are the data from? Who actually did the research – and how did they do it? Who paid for it? Can the data be independently validated? And ultimately, do the data make sense in context? Many, perhaps most, data-generated myths and untruths can be easily busted after asking these basic journalistic questions.

It is time for news organizations, accreditation bodies, training firms and universities to get together to mass-produce that badly needed vaccine.

This commentary is based primarily on the authors’ introduction to a special issue (January 2016) of Journalism: Theory, Practice & Criticism on statistics in journalism and journalism education. A pre-print version can be found here.
