24/1/2013

## When Should I Use Logarithmic Scales in my Charts and Graphs?

*Originally published by Naomi Robbins on blogs.forbes.com. Republished with permission.*

There are two main reasons to use logarithmic scales in charts and graphs. The first is to respond to skewness towards large values; i.e., cases in which one or a few points are much larger than the bulk of the data. The second is to show percent change or multiplicative factors. First I will review what we mean by logarithms. Then I will provide more detail about each of these reasons and give examples.

To refresh your memory of school math, logs are just another way of writing exponential equations, one that lets you isolate the exponent on one side of the equation. The equation 2⁴ = 16 can be rewritten as log<sub>2</sub> 16 = 4 and pronounced “log to the base 2 of 16 is 4.” It is helpful to remember that the log is the exponent, in this case “4”. The equation y = log<sub>b</sub>(x) means that y is the power or exponent to which b is raised in order to get x. The most common base for logarithmic scales is 10. However, other bases are also useful. While a base of ten is useful when the data range over several orders of magnitude, a base of two is useful when the data have a smaller range.
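As a quick check of the definition, here is a minimal Python sketch. Python is used only for illustration and is not part of the original article, and the helper name `log_base` is hypothetical. It shows that the log is the exponent and that raising the base to the log recovers the original number.

```python
import math


def log_base(x: float, b: float) -> float:
    """Return y such that b ** y equals x, up to floating-point rounding."""
    return math.log(x, b)


# The equation 2**4 == 16 rewritten as a logarithm: the log is the exponent.
exponent = math.log2(16)
print(exponent)                    # 4.0

# Raising the base to the log gives back the original number.
print(2 ** exponent)               # 16.0

# The same idea for base 10 and for an arbitrary base.
print(math.log10(1000))            # exponent of 10 that yields 1000
print(round(log_base(81, 3), 6))   # exponent of 3 that yields 81
```

The built-in `math.log2` and `math.log10` are used where they apply because they are more accurate for their bases than the general two-argument `math.log`.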

*Figure 1. Dot plot of revenues of the top 60 Fortune 500 companies. Data Source: http://money.cnn.com/magazines/fortune/fortune500/2011/full_list/*

Figure 1 uses a dot plot to show the revenues of the top 60 companies on the 2011 Fortune 500 list, which provides revenues for 2010. One reason for choosing a dot plot rather than a bar chart is that it is less cluttered. We will be learning other benefits of dot plots in this and future posts.

Wal-Mart Stores and Exxon Mobil have much larger revenues than the other companies. As a result, the differences in the revenues of the other companies are compressed, making these differences more difficult to judge.

*Figure 2. Dot plot of revenues of top 60 Fortune 500 companies on a log scale with base 2.*

The same data are plotted in Figure 2 on a logarithmic scale with base 2. I chose base 2 to avoid the tick marks with decimal exponents that base 10 would have produced. The data range from about 40 to about 400 billion dollars, which is only about one order of magnitude. Figure 3 plots the data with logs to the base 10 and tick labels in powers of ten. If we want more than one or two tick marks, we get the decimal exponents shown in Figure 3. Using base 2 avoids this problem. Next week we will discuss alternative ways of labeling log scales.

*Figure 3. Dot plot of data of Figure 2 shown on a log scale with base of 10*
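The tick-mark reasoning behind Figures 2 and 3 can be checked with a short sketch. It assumes the approximate data range of 40 to 400 (billions of dollars) quoted in the text. The function name `power_ticks` is hypothetical and is not taken from any charting library.

```python
import math


def power_ticks(lo: float, hi: float, base: int) -> list:
    """Return the whole-number powers of `base` that fall inside [lo, hi]."""
    first = math.ceil(math.log(lo, base))    # smallest usable exponent
    last = math.floor(math.log(hi, base))    # largest usable exponent
    return [base ** k for k in range(first, last + 1)]


# Approximate range of the revenue data, in billions of dollars.
lo, hi = 40, 400

base2_ticks = power_ticks(lo, hi, 2)
base10_ticks = power_ticks(lo, hi, 10)
print(base2_ticks)    # [64, 128, 256]
print(base10_ticks)   # [100]

# Placing the same three ticks on a base-10 axis needs decimal exponents.
print([round(math.log10(t), 2) for t in base2_ticks])
```

With whole-number exponents, base 10 offers a single tick inside this range while base 2 offers three, which is the trade-off described above.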

In a dot plot, a value is judged by the position of its dot along an axis; in this case, the horizontal or x axis. In a bar chart, a value is judged by the length of its bar. I don’t like using lengths with logarithmic scales. That is a second reason that I prefer dot plots over bar charts for these data.

In Figure 2, the value of each tick mark is double the value of the preceding one. The top axis emphasizes the fact that the data are logs. The bottom axis shows the values in the original scale. This labeling follows the advice of William Cleveland, with the top and bottom axes interchanged. The data values are spread out better with the logarithmic scale. This is what I mean by responding to skewness towards large values.

The revenue for Boeing is about 2⁶ billion dollars while the revenue for Ford Motor is about 2⁷ billion dollars. In Figure 1, the linear scale, the revenue for Ford is the revenue for Boeing **plus** the difference between these two revenues. We call this additive. Since 2⁶ = 64 and 2⁷ = 128, we see that the difference is about 64 billion dollars. In Figure 2 the difference is multiplicative. Since 2⁷ = 2⁶ **times** 2, we see that the revenues for Ford Motor are about double those for Boeing. This is what I mean by saying that we use logarithmic scales to show multiplicative factors.
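The additive and multiplicative readings can be worked through numerically. The sketch below uses the rounded revenues from the text (2⁶ and 2⁷ billion dollars), not the exact Fortune 500 figures, and the variable names are illustrative.

```python
import math

# Rounded revenues from the text, in billions of dollars.
boeing = 2 ** 6   # 64
ford = 2 ** 7     # 128

# Linear scale: the comparison is additive (a difference).
difference = ford - boeing
print(difference)   # 64

# Log scale: the comparison is multiplicative (a ratio).
ratio = ford / boeing
print(ratio)        # 2.0

# On a base-2 axis, a doubling is always one unit of distance,
# wherever it happens along the axis.
gap_ford_boeing = math.log2(ford) - math.log2(boeing)
gap_small = math.log2(80) - math.log2(40)
print(gap_ford_boeing)                 # 1.0
print(math.isclose(gap_small, 1.0))    # True
```

The last two lines illustrate why the log scale spreads out the smaller companies: going from 40 to 80 covers the same distance on the axis as going from 64 to 128.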

The previous example illustrated both reasons: responding to large values and showing multiplicative factors. The next example concerns only rates of change. Suppose we had one widget in 1999 and doubled the number each year. The following charts show the number of widgets on a linear and a logarithmic scale:

*Figure 4. A comparison of linear and logarithmic (log) scales*

The linear scale shows the absolute number of widgets over time, while the logarithmic scale shows the rate of change of the number of widgets over time. The bottom chart of Figure 4 makes it much clearer that the rate of change, or growth rate, is constant.
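The widget example can be reproduced in a few lines. The end year of 2009 is an assumption made for this sketch, since the text gives only the starting year and the doubling rule. The year-to-year steps play the role of the slope on each chart.

```python
import math

# One widget in 1999, doubling every year. The 2009 end year is assumed.
years = list(range(1999, 2010))
widgets = [2 ** (year - 1999) for year in years]

# Year-to-year change on the linear scale: it keeps growing.
linear_steps = [b - a for a, b in zip(widgets, widgets[1:])]

# Year-to-year change on a base-2 log scale: it stays constant.
log_steps = [math.log2(b) - math.log2(a) for a, b in zip(widgets, widgets[1:])]

print(widgets)        # 1, 2, 4, ... up to 1024
print(linear_steps)   # 1, 2, 4, ... up to 512
print(log_steps)      # every step is 1.0
```

On the linear scale each step is as large as the whole previous total, which produces the steep curve, while on the log scale every step is the same size, which produces a straight line.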

Dr. Nicolas Bissantz in his blog, Me, Myself, and BI, would call the linear chart a *panic chart.* He says that “line charts are speed charts.” That is, they show the rate of change or slope of the number of widgets. A chart with a linear scale similar to the top chart of Figure 4 showing a quantity such as our national debt causes panic even if the rate of change is constant.

Logarithmic scales are extremely useful but are not understood by all. As in all presentations, designers must know their audiences.