What Kinds of Stories Can You Find in Data?
Image by Flickr user ashley.adcox
I think it helps to bear the list below in mind, not only when you are analyzing data but also at the earlier stage, when you are collecting it (whether you are looking for publicly available datasets or compiling freedom of information requests).
Measurement
The simplest story: counting or totaling something. “Local councils across the country spent a total of £x billion on paper clips last year.” But it’s often difficult to know if that’s a lot or a little. For that, you need context, which can be provided by:
Proportion
“Last year local councils spent two-thirds of their stationery budget on paper clips.”
Internal comparison
“Local councils spend more on paper clips than on providing meals-on-wheels for the elderly.”
External comparison
“Council spending on paper clips last year was twice the nation’s overseas aid budget.”
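The three kinds of context above each come down to a single division. Here is a minimal Python sketch of that arithmetic. Every figure in it is invented for illustration, and the variable names are hypothetical.

```python
# All figures below are invented for illustration; they are not real council data.
paper_clips = 2_000_000        # hypothetical spend on paper clips (£)
stationery_budget = 3_000_000  # hypothetical total stationery budget (£)
meals_on_wheels = 1_600_000    # hypothetical spend on meals-on-wheels (£)
overseas_aid = 1_000_000       # hypothetical benchmark from outside the dataset (£)

# Proportion: paper clips as a share of the budget they belong to.
proportion = paper_clips / stationery_budget

# Internal comparison: paper clips against another line in the same accounts.
internal_ratio = paper_clips / meals_on_wheels

# External comparison: paper clips against a figure from a different source.
external_ratio = paper_clips / overseas_aid

print(f"Share of stationery budget: {proportion:.0%}")
print(f"Versus meals-on-wheels: {internal_ratio:.2f} times")
print(f"Versus overseas aid: {external_ratio:.1f} times")
```

Each ratio needs a second figure besides the paper clip total, so the denominators have to be collected along with the headline number.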
There are also other ways of exploring the data in a contextual or comparative way:
Change over time
“Council spending on paper clips has trebled in the past four years.”
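A claim like "trebled" is a growth factor, which is easy to confuse with a percentage increase. The sketch below, using invented yearly totals and hypothetical function names, shows both calculations side by side.

```python
# Invented yearly totals in pounds; not real data.
spend_by_year = {
    2019: 1_200_000,
    2020: 1_900_000,
    2021: 2_600_000,
    2022: 3_100_000,
    2023: 3_600_000,
}

def growth_factor(series, start, end):
    """How many times larger the end value is than the start value."""
    return series[end] / series[start]

def percent_change(series, start, end):
    """Increase from start to end, as a percentage of the start value."""
    return (growth_factor(series, start, end) - 1) * 100

factor = growth_factor(spend_by_year, 2019, 2023)
change = percent_change(spend_by_year, 2019, 2023)

print(f"Spending is {factor:.1f} times its 2019 level")
print(f"That is an increase of {change:.0f}%")
```

Spending that has trebled has risen by 200%, not 300%, which is why it helps to compute both figures before writing the sentence.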
League tables
These are often geographical or by institution, and you must make sure the basis for comparison is fair (e.g., taking into account the size of the local population).
“Borsetshire Council spends more on paper clips for each member of staff than any other local authority, at a rate four times the national average.”
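To see why a fair basis matters, here is a small Python sketch that ranks councils by total spending and then by spending per member of staff. The numbers are invented, "Council A" and "Council B" are hypothetical, and the national average is assumed to mean total spending divided by total staff, which is only one of several possible definitions.

```python
# Invented figures for illustration; only the name Borsetshire comes from the text.
councils = {
    "Borsetshire": {"spend": 40_000, "staff": 500},
    "Council A": {"spend": 90_000, "staff": 5_000},
    "Council B": {"spend": 40_000, "staff": 3_000},
}

# Spending per member of staff for each council.
per_staff = {name: c["spend"] / c["staff"] for name, c in councils.items()}

# Assumption: the national average is total spend over total staff.
total_spend = sum(c["spend"] for c in councils.values())
total_staff = sum(c["staff"] for c in councils.values())
national_average = total_spend / total_staff

biggest_total = max(councils, key=lambda name: councils[name]["spend"])
biggest_per_staff = max(per_staff, key=per_staff.get)
times_average = per_staff[biggest_per_staff] / national_average

print(f"Largest total spend: {biggest_total}")
print(f"Largest spend per member of staff: {biggest_per_staff}")
print(f"{biggest_per_staff} is at {times_average:.1f} times the national average")
```

In this made-up table the council with the largest total is not the one at the top of the per-staff ranking, so the choice of denominator changes the story.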
Or you can divide the data subjects into groups:
Analysis by categories
“Councils run by the Purple Party spend 50% more on paper clips than those controlled by the Yellow Party.”
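A category comparison like this one amounts to grouping the records and comparing the group averages. Here is a minimal sketch using only the Python standard library. The records are invented and the council names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (council, controlling party, paper clip spend in pounds).
records = [
    ("Council A", "Purple", 120_000),
    ("Council B", "Purple", 180_000),
    ("Council C", "Yellow", 90_000),
    ("Council D", "Yellow", 110_000),
]

# Group the spending figures by controlling party.
spend_by_party = defaultdict(list)
for _council, party, spend in records:
    spend_by_party[party].append(spend)

# Average spend within each group.
average_by_party = {
    party: mean(values) for party, values in spend_by_party.items()
}

# Relative difference between the two group averages.
difference = average_by_party["Purple"] / average_by_party["Yellow"] - 1

print(f"Purple average: £{average_by_party['Purple']:,.0f}")
print(f"Yellow average: £{average_by_party['Yellow']:,.0f}")
print(f"Purple councils spend {difference:.0%} more on average")
```

Comparing raw averages assumes the councils in each group are of broadly similar size. If they are not, the same grouping can be applied to per-head figures instead.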
Or you can relate factors numerically:
Association
“Councils run by politicians who have received donations from stationery companies spend more on paper clips, with spending increasing on average by £100 for each pound donated.”
But, of course, always remember that correlation and causation are not the same thing.
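A figure such as "£100 for each pound donated" is the slope of a straight line fitted through the data. The sketch below works out an ordinary least-squares slope and a Pearson correlation by hand for four invented data points. The function name is hypothetical, and a fitted slope describes an association, not a cause.

```python
from math import sqrt

# Invented data: donations received (£) and paper clip spending (£) per council.
donations = [0, 10, 20, 30]
spending = [4_900, 6_300, 6_700, 8_100]

def slope_and_correlation(xs, ys):
    """Least-squares slope of ys on xs, and the Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

slope, r = slope_and_correlation(donations, spending)

print(f"Extra spending per pound donated: £{slope:.0f}")
print(f"Correlation coefficient: {r:.2f}")
```

A strong correlation in a handful of councils can still arise by chance or through a third factor, such as council size, that drives both donations and spending.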
So if you’re investigating paper clip spending, are you also getting the following figures?
- Total spending to provide context?
- Geographical/historical/other breakdowns to provide comparative data?
- The additional data you need to ensure comparisons are fair, such as population size?
- Other data that might provide interesting analysis to compare or relate the spending to?