|
2 | 2 |
|
3 | 3 | **Learning objectives:** |
4 | 4 |
|
5 | | -- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY |
| 5 | +- Understand the principle of proportional ink |
6 | 6 |
|
7 | | -## SLIDE 1 {-} |
| 7 | +## The principle of proportional ink {-} |
8 | 8 |
|
9 | | -- ADD SLIDES AS SECTIONS (`##`). |
10 | | -- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF. |
| 9 | +> The principle of proportional ink, a term coined by Bergstrom and West in 2016, states that **the sizes of shaded areas in a visualization need to be proportional to the data values they represent.** |
| 10 | +
|
| 11 | +- "Ink" refers to any part of the visualization that deviates from the background color: lines, points, shaded areas, text. |
| 12 | + |
| 13 | +- The principle applies to bar graphs and area graphs, where the height or length of the bar (or the size of the shaded area) represents the data value. It does not apply to dots or points on a graph, for example. |
| 14 | + |
| 15 | +## Visualizations along linear axes {-} |
| 16 | + |
| 17 | +- Consider the following graph that shows the median income in the five counties that make up the state of Hawaii. |
| 18 | + - This graph is misleading: the endpoint of each bar correctly marks the median income in each county, but the bar height represents only the extent to which the median income exceeds $50,000, an arbitrary number. |
| 19 | + |
| 20 | + |
| 21 | + |
| 22 | +- Similar issues arise with area graphs. |
| 23 | + |
| 24 | + |
| 25 | + |
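The distortion caused by a truncated baseline can be quantified. The following is a minimal Python sketch, not taken from the notes: the income values are hypothetical (they are not the figures plotted in the graph), and `drawn_ratio` is a helper name introduced only for illustration. It compares the ratio of two data values with the ratio of the bar lengths when bars start at $50,000.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the bar lengths for values a and b when bars start at `baseline`."""
    return (a - baseline) / (b - baseline)

# Hypothetical median incomes in USD, chosen only for illustration.
county_a = 70_000
county_b = 55_000

true_ratio = county_a / county_b                      # ratio of the data values
zero_based = drawn_ratio(county_a, county_b)          # bars starting at 0
truncated = drawn_ratio(county_a, county_b, 50_000)   # bars starting at $50,000

print(f"data ratio:        {true_ratio:.2f}")
print(f"bars from 0:       {zero_based:.2f}")
print(f"bars from $50,000: {truncated:.2f}")
```

With these assumed values, the bar drawn from $50,000 is four times as long as its neighbor even though the two incomes differ by only about 27%. Bars drawn from zero reproduce the data ratio exactly.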
| 26 | +## How to properly represent visualizations along linear axes {-} |
| 27 | + |
| 28 | +- To represent small changes over time or differences between conditions, plot the differences themselves. For example, show the change in median income in Hawaiian counties from 2010 to 2015. |
| 29 | + |
| 30 | + |
| 31 | + |
| 32 | +- Similarly, we could show the change in Facebook stock price over time as the difference from its temporary high point on Oct. 22, 2016. The shaded area now represents the distance from the high point. |
| 33 | + |
| 34 | + |
| 35 | + |
| 36 | +## Visualizations along logarithmic axes {-} |
| 37 | + |
| 38 | +- On a linear scale, the areas of bars or rectangles (drawn from zero) are proportional to the data values. |
| 39 | +- On a logarithmic scale, this proportionality is lost because axis spacing isn’t linear. |
| 40 | +- As an example, consider the gross domestic products (GDPs) of countries in Oceania. In 2007, these varied from less than a billion U.S. dollars (USD) to over 300 billion USD. |
| 41 | + - Visualizing these numbers on a linear scale would not work, because the two countries with the largest GDPs (New Zealand and Australia) would dominate the figure. |
| 42 | + |
| 43 | + |
| 44 | + |
| 45 | + |
| 46 | +## Working with log scales that do not work {-} |
| 47 | + |
| 48 | +- However, the visualization with bars on a log scale does not work either. The bars start at an arbitrary value of 0.3 billion USD, so at a minimum the figure suffers from the same problem as a bar graph with a truncated axis: the bar lengths are not representative of the data values. |
| 49 | + |
| 50 | + |
| 51 | + |
| 52 | +- To fix it, replace the bars with points, which encode each value by position rather than by length. |
| 53 | + |
| 54 | + |
| 55 | + |
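Why the starting value matters on a log scale can be checked numerically. In the sketch below, the GDP values are hypothetical (chosen within the range quoted above, not actual country figures) and `log_bar_length` is an illustrative helper. The length of a bar on a log axis is the difference between the logarithms of its end and its start, so the relative bar lengths change whenever the start value changes.

```python
import math

def log_bar_length(value, start):
    """Length of a bar on a log10 axis, in decades, drawn from `start` to `value`."""
    return math.log10(value) - math.log10(start)

# Hypothetical GDPs in billion USD, for illustration only.
small, large = 3.0, 300.0

for start in (0.3, 0.03):
    ratio = log_bar_length(large, start) / log_bar_length(small, start)
    print(f"bars start at {start}: length ratio = {ratio:.2f}")

print(f"ratio of the data values: {large / small:.0f}")
```

Under these assumptions, the larger bar is about 3 times as long when bars start at 0.3 and about 2 times as long when they start at 0.03, while the data values differ by a factor of 100. No choice of starting value makes the bar lengths proportional to the amounts.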
| 56 | +## How to properly work with log scales in bar plots {-} |
| 57 | + |
| 58 | +- A log scale is the natural scale for visualizing ratios, because a unit step along a log scale corresponds to multiplication or division by a constant factor. |
| 59 | +- When bars are drawn on a log scale, they represent ratios and need to be drawn starting from 1, not 0. |
| 60 | +- If we want to visualize ratios rather than amounts, however, bars on a log scale are a perfectly good option. In fact, they are preferable over bars on a linear scale in that case. |
| 61 | +- As an example, let’s visualize the GDP values of countries in Oceania relative to the GDP of Papua New Guinea. |
| 62 | + |
| 63 | + |
| 64 | + |
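The reason 1 is the natural baseline for ratios on a log scale can be seen in a short sketch. The ratios below are hypothetical (they are not the actual GDP ratios relative to Papua New Guinea), and `signed_length` is a helper introduced only for illustration.

```python
import math

def signed_length(ratio):
    """Signed length, in decades, of a log-scale bar drawn from 1 to `ratio`."""
    return math.log10(ratio)

# Hypothetical ratios relative to a reference value.
for ratio in (0.1, 0.5, 1.0, 2.0, 10.0):
    print(f"ratio {ratio:>5}: bar length {signed_length(ratio):+.3f}")
```

A ratio of 1 gets a bar of zero length, and a ratio and its reciprocal (for example 2 and 0.5) get bars of equal length pointing in opposite directions. A bar drawn from 1 therefore shows how many multiplicative steps separate a value from the reference.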
| 65 | +## Direct area visualization {-} |
| 66 | + |
| 67 | +- Pie charts follow the principle of proportional ink because wedge area (via angle) is proportional to the data value. |
| 68 | +- A pie wedge encodes a value as a combination of distances forming an area, which reduces accuracy. |
| 69 | + |
| 70 | + |
| 71 | + |
| 72 | + |
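As a minimal sketch of how a pie chart keeps ink proportional to data, the wedge angle for each value is its share of the total times 360 degrees. The values below are hypothetical and `wedge_angles` is an illustrative helper, not part of any plotting library.

```python
def wedge_angles(values):
    """Return the wedge angle in degrees for each value in a pie chart."""
    total = sum(values)
    return [360.0 * v / total for v in values]

# Hypothetical data values, for illustration only.
values = [10, 30, 60]
angles = wedge_angles(values)

for v, angle in zip(values, angles):
    print(f"value {v:>3}: {angle:.1f} degrees")
```

Because the radius is the same for every wedge, the wedge area is proportional to its angle and therefore to the data value.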
| 73 | +## Direct area visualization in bar graphs {-} |
| 74 | + |
| 75 | +- However, people perceive pie chart areas differently than bar chart areas. |
| 76 | +- Human perception is tuned to judge distances more accurately than areas. |
| 77 | +- A bar encodes a value as a single distance (length), making it easier to read precisely. |
| 78 | + |
| 79 | + |
| 80 | + |
| 81 | +- The same limitation, that people judge distances more accurately than areas, also applies to treemaps, which can be thought of as square versions of pie charts. |
| 82 | + |
| 83 | + |