18/7/2012

How to Create the Perfect Line Chart

 

The original version of this post was published by Gregor Aisch on vis4.net, on 20 June 2012. This post has been edited and republished with permission.

 

I recently joined Datawrapper, an open-source project that aims to provide simple, embeddable charts for journalists. Really, no fancy stuff here: we're just talking about line charts and bar charts. Limiting ourselves to those chart types gave us a good opportunity to think about the best way to create them. So it happened that this week I got to think a bit about the perfect line chart.

Listen to Tufte and keep it simple

You cannot talk about perfect charts without mentioning the great books of Edward Tufte. The Visual Display of Quantitative Information in particular contains a lot of good advice about creating line charts. Tufte suggests looking at what he calls the "data-ink ratio" and shows how removing certain chart elements can increase a chart's readability. For instance, you don't need to draw a box around the chart area. You can also use the ends of the axis lines to display the minimum and maximum values in the data.

line-charts-tufte.png

Forget about the separate legend

Separate legends are the worst-case scenario in the line chart world. Often the legend sits below the chart, or lists the lines in an arbitrary order. You want to allow instant identification of the lines, but forcing viewers to look them up in a legend takes way too much time. Instead, you should put the labels close to the lines.

labeling.png

The great side effect of putting the labels next to the lines is that you no longer depend on fancy colours or disturbing symbols to identify individual lines. Extra points for simplicity.
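Labels placed at the line ends can collide when two lines finish close together. Here is a minimal sketch of one way to handle that, assuming the labels sit to the right of each line's last point. The function name place_end_labels and the min_gap parameter are hypothetical, and this greedy nudge is only an illustration, not necessarily how Datawrapper positions its labels.

```python
def place_end_labels(last_values, min_gap):
    """Return a y position for each line's label, next to its last point.

    last_values: dict mapping a line name to the y value of its last point.
    min_gap: smallest allowed vertical distance between two labels,
             in the same units as the y values.
    Labels are visited from bottom to top and pushed upwards just far
    enough to keep min_gap between neighbours.
    """
    placed = {}
    previous_y = None
    for name, y in sorted(last_values.items(), key=lambda item: item[1]):
        if previous_y is not None and y - previous_y < min_gap:
            y = previous_y + min_gap  # nudge upwards to avoid overlap
        placed[name] = y
        previous_y = y
    return placed


# Illustrative values only: two lines end very close to each other.
positions = place_end_labels({"A": 10, "B": 11, "C": 30}, min_gap=5)
print(positions)
```

In this example the label for B is moved from 11 to 15 so that it no longer sits on top of A, while C stays at its line end.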

Highlight what's important

Although it is possible to tell hundreds of stories using a single line chart, it makes a lot of sense to keep the focus on just one story. Therefore you should highlight just one or two important lines in the chart and keep the others in the background as context.

screenshot-2012-06-20-um-13.32.38.png

Zero baseline or not?

Sometimes you hear the advice that every (line) chart should have a baseline of zero, because otherwise it would be "lying". As a counter-example, here's the (approximate) intraday stock quote data of the Facebook IPO day using a zero baseline. The reason why nobody shows stock charts this way is obvious.

screenshot-2012-06-20-um-16.59.52.png

It's almost impossible to see the ups and downs of the first day of the Facebook stock. Without the zero baseline the chart reveals much more of the data.

screenshot-2012-06-20-um-17.02.54.png
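To put a number on the difference, here is a rough sketch that computes how much of the chart height the data actually occupies for a given baseline. The function name and the prices are made up for illustration; they are not the actual Facebook quotes.

```python
def vertical_space_used(values, baseline=None):
    """Fraction of the chart height covered by the spread of the data.

    With baseline=None the y-axis starts at the smallest value,
    so the data fills the full height.
    """
    lowest, highest = min(values), max(values)
    axis_start = lowest if baseline is None else baseline
    if highest == axis_start:
        raise ValueError("the y-axis has zero height")
    return (highest - lowest) / (highest - axis_start)


# Hypothetical intraday prices, for illustration only.
prices = [42.0, 45.0, 40.0, 38.0]

print(round(vertical_space_used(prices, baseline=0), 3))  # zero baseline
print(round(vertical_space_used(prices), 3))              # axis fitted to the data
```

With these made-up values, the zero-baseline chart spends only about 16% of its height on the actual ups and downs; the rest is empty space below the line.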

However, to minimize the risk of confusing readers with a non-zero baseline chart, I suggest not drawing the axes as connected lines. This way the Y-axis doesn't visually 'touch' the 'ground'.

non-connected-axes.png

Finding a nice aspect ratio

The big advantage of line charts is that they enable the comparison of slopes, which is not easily possible in a bar chart, for instance. The problem, however, is that the perceived slopes depend heavily on the aspect ratio of the chart. The Facebook stock data would have looked much more dramatic in a taller chart. So which aspect ratio should you choose? Some years ago, William Cleveland suggested a technique called "banking" to solve this problem. The core idea is that the slopes in a line chart are most readable if they average to 45°. In 2006, Jeffrey Heer and Maneesh Agrawala continued the work of Cleveland and described 12 different banking algorithms. I used one of the simplest of them, the "median-absolute-slope" banking.
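As a rough sketch of the idea, median-absolute-slope banking can be written in a few lines. It picks the width-to-height ratio at which the median absolute slope of the line segments is drawn at 45°, which works out to the median absolute slope multiplied by the x range and divided by the y range. The function name and sample data below are hypothetical, and skipping vertical segments is a simplification made here, not something taken from the paper.

```python
from statistics import median


def bank_median_absolute_slope(xs, ys):
    """Return the width/height aspect ratio that displays the median
    absolute segment slope of the line (xs, ys) at 45 degrees."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two points with matching x and y")
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data has no extent in x or y")

    points = list(zip(xs, ys))
    # Absolute slope of every segment in data units.
    # Vertical segments (x1 == x0) are skipped, an assumption of this sketch.
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0
    ]
    median_slope = median(slopes)
    if median_slope == 0:
        raise ValueError("median slope is zero, banking is undefined")

    # On screen a slope s appears as s * (x_range / y_range) * (height / width).
    # Setting that to 1 for the median slope and solving for width / height:
    return median_slope * x_range / y_range


# Made-up sample data, for illustration only.
aspect = bank_median_absolute_slope([0, 1, 2, 3], [0, 2, 1, 3])
chart_width = 600
chart_height = chart_width / aspect
print(aspect, chart_height)
```

For the sample data the segment slopes are 2, 1 and 2, so the median is 2 and the banked chart comes out twice as wide as it is tall. A series with steeper typical slopes relative to its ranges yields an even wider chart, while a series with gentler slopes yields a narrower, more portrait-like one.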

Finally, here's what the Facebook stock chart looks like after banking. The curve looks less dramatic now, but is still easy to read.

Facebook_IPO.png

The problem with banking is that sometimes you need the chart in a certain aspect ratio to fit into a page layout, especially if banking produces portrait-sized charts. But why not let the optimal chart ratio define your layout? For instance, you can place the additional information to the side of the chart. Remember that the main goal of banking is to increase the readability of the line slopes. In the following example, the slopes for "nuclear" and "renewables" would have been much more difficult to read if the chart had been fitted to a landscape aspect ratio.

banked-portrait.png

Turning best practices into actual tools

To conclude, I am very happy to say that these best practices won't remain just theory. Everything I discussed in this post will be integrated into the upcoming release of Datawrapper, which I used to produce most of the charts in this post. You can follow @datawrapper if you want to keep up to date with the project.

If you have further suggestions or recommendations regarding line charts, I'm looking forward to reading your comments.
