Commit 779f6d9

Slides for ch17 (#14)
* Slides for ch17
* Slides for chapter 17
* Update to chapter 17

I incorporated the changes requested.
1 parent 1ac272e commit 779f6d9

20 files changed

+122
-8
lines changed

17_the-principle-of-proportional-ink.Rmd

Lines changed: 77 additions & 4 deletions
@@ -2,9 +2,82 @@
22

33
**Learning objectives:**
44

5-
- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
5+
- Understand the principle of proportional ink
66

7-
## SLIDE 1 {-}
7+
## The principle of proportional ink {-}
88

9-
- ADD SLIDES AS SECTIONS (`##`).
10-
- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
9+
> The principle of proportional ink, coined by Bergstrom and West in 2016, states that **the sizes of shaded areas in a visualization need to be proportional to the data values they represent.**
10+
11+
- "Ink" commonly refers to any part of the visualization that deviates from the background color: lines, points, shaded areas, and text.
12+
13+
- This principle applies to bar graphs and area graphs, where the height or length of the shaded region represents the data value. It does not apply to dots or points on a graph, for example.
14+
15+
## Visualizations along linear axes {-}
16+
17+
- Consider the following graph that shows the median income in the five counties that make up the state of Hawaii.
18+
- This graph is misleading because while the endpoint of each bar correctly represents the actual median income in each county, the bar height represents the extent to which median incomes exceed $50,000, an arbitrary number.
19+
20+
![](images/fig17_01.png)
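
The distortion described above can be checked with a little arithmetic. The following is a minimal sketch, assuming two made-up median incomes (not the values from the figure) and a hypothetical helper `drawn_ratio`; it compares the ratio of the data values with the ratio of the bar lengths when the bars start at $50,000.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the bar lengths for a and b when both bars start at `baseline`."""
    if b == baseline:
        raise ValueError("the second bar would have zero length")
    return (a - baseline) / (b - baseline)


# Hypothetical median incomes in USD; these are NOT the values from the figure.
income_a, income_b = 70_000, 60_000

true_ratio = drawn_ratio(income_a, income_b)          # bars start at 0
truncated = drawn_ratio(income_a, income_b, 50_000)   # bars start at $50,000

print(f"data ratio: {true_ratio:.2f}, drawn ratio: {truncated:.2f}")
```

With these assumed numbers the data differ by a factor of about 1.17, but the truncated bars differ by a factor of 2, which is the kind of exaggeration the principle of proportional ink warns against.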
21+
22+
- Similar issues arise with area graphs.
23+
24+
![](images/fig17_03.png)
25+
26+
## How to properly represent visualizations along linear axes {-}
27+
28+
- To represent small changes over time or differences between conditions, plot the change itself. For example, consider showing the change in median income in Hawaiian counties from 2010 to 2015.
29+
30+
![](images/fig17_05.png)
31+
32+
- Similarly, we could show the change in Facebook stock price over time as the difference from its temporary high point on Oct. 22, 2016. The shaded area now represents the distance from the high point.
33+
34+
![](images/fig17_06.png)
35+
36+
## Visualization along logarithmic axes {-}
37+
38+
- On a linear scale, bar/rectangle areas are proportional to data values.
39+
- On a logarithmic scale, this proportionality is lost because axis spacing isn’t linear.
40+
- As an example, consider the gross domestic products (GDPs) of countries in Oceania. In 2007, these varied from less than a billion U.S. dollars (USD) to over 300 billion USD.
41+
- Visualizing these numbers on a linear scale would not work, because the two countries with the largest GDPs (New Zealand and Australia) would dominate the figure.
42+
43+
![](images/fig17_07.png)
44+
45+
46+
## Working with log scales that do not work {-}
47+
48+
- However, the visualization with bars on a log scale does not work either. The bars start at an arbitrary value of 0.3 billion USD, and at a minimum the figure suffers from the same problem, that the bar lengths are not representative of the data values.
49+
50+
![](images/fig17_08.png)
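
To see why the starting value matters, here is a small sketch under stated assumptions: the two GDP values are made up, `log_bar_length` is a hypothetical helper, and bar length is measured in decades on a base-10 log axis. It shows that the ratio of bar lengths changes with the arbitrary baseline and never matches the ratio of the data.

```python
import math


def log_bar_length(value, baseline):
    """Length of a bar on a log10 axis, in decades, drawn from `baseline` to `value`."""
    if value <= 0 or baseline <= 0:
        raise ValueError("log scales need positive values")
    return math.log10(value / baseline)


# Hypothetical GDPs in billion USD; the data ratio is 100.
small, large = 1.0, 100.0

for baseline in (0.3, 0.1):
    ratio = log_bar_length(large, baseline) / log_bar_length(small, baseline)
    print(f"baseline {baseline}: bar length ratio {ratio:.2f}")
```

With a baseline of 0.3 the longer bar is roughly 4.8 times the shorter one, and with a baseline of 0.1 it is 3 times as long, although the underlying values differ by a factor of 100 in both cases.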
51+
52+
- To fix it, change bars to points.
53+
54+
![](images/fig17_09.png)
55+
56+
## How to properly work with log scales in bar plots {-}
57+
58+
- A log scale is the natural scale to visualize ratios, because a unit step along a log scale corresponds to multiplication with or division by a constant factor.
59+
- When bars are drawn on a log scale, they represent ratios and need to be drawn starting from 1, not 0.
60+
- If we want to visualize ratios rather than amounts, however, bars on a log scale are a perfectly good option. In fact, they are preferable over bars on a linear scale in that case.
61+
- As an example, let’s visualize the GDP values of countries in Oceania relative to the GDP of Papua New Guinea.
62+
63+
![](images/fig17_10.png)
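
The reasoning about ratios can be sketched in a few lines. This is an illustration only: the reference GDP and the other values are made up, and `ratio_bar` is a hypothetical helper. On a log axis a bar that starts at ratio 1 has signed length `log10(ratio)`, so doubling and halving give bars of equal length in opposite directions.

```python
import math


def ratio_bar(value, reference):
    """Signed bar length on a log10 axis for value/reference; bars start at ratio 1."""
    if value <= 0 or reference <= 0:
        raise ValueError("ratios on a log scale need positive values")
    return math.log10(value / reference)


reference = 8.0  # hypothetical reference GDP in billion USD
for gdp in (4.0, 8.0, 16.0):
    ratio = gdp / reference
    print(f"ratio {ratio:.1f} -> bar length {ratio_bar(gdp, reference):+.3f}")
```

A ratio of 1 gives a bar of length zero, which is why 1 rather than 0 is the natural starting point for bars that show ratios.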
64+
65+
## Direct area visualization {-}
66+
67+
- Pie charts follow the principle of proportional ink because wedge area (via angle) is proportional to the data value.
68+
- A pie wedge encodes a value as a combination of distances forming an area, which reduces accuracy.
69+
70+
![](images/fig17_11.png)
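
As a quick check of the proportionality claim, the sketch below computes wedge angles for made-up values with a hypothetical helper `wedge_angles`. For a fixed radius, wedge area is proportional to the central angle, so angles proportional to the data give areas proportional to the data.

```python
def wedge_angles(values):
    """Central angle in degrees for each pie wedge, proportional to its value."""
    total = sum(values)
    if total <= 0 or any(v < 0 for v in values):
        raise ValueError("values must be non-negative with a positive sum")
    return [360.0 * v / total for v in values]


shares = [50, 30, 20]  # hypothetical data values
angles = wedge_angles(shares)
print(angles)
```

The angles always sum to 360 degrees, and the ratio of any two angles equals the ratio of the corresponding data values.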
71+
72+
73+
## Direct area visualization in bar graphs {-}
74+
75+
- However, people perceive pie chart areas differently than bar chart areas.
76+
- Human perception is tuned to judge distances more accurately than areas.
77+
- A bar encodes a value as a single distance (length), making it easier to read precisely.
78+
79+
![](images/fig17_12.png)
80+
81+
- The problem that human perception is better at judging distances than at judging areas also occurs in treemaps, which are square versions of pie charts.
82+
83+
![](images/fig17_13.png)

20_redundant-coding.Rmd

Lines changed: 45 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2,9 +2,50 @@
22

33
**Learning objectives:**
44

5-
- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
5+
- Understand the concept of "redundant coding"
66

7-
## SLIDE 1 {-}
7+
## Redundant coding {-}
88

9-
- ADD SLIDES AS SECTIONS (`##`).
10-
- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
9+
- Redundant coding is a term coined by the author.
10+
- It refers to using multiple different aesthetic dimensions to convey information presented on a graph instead of relying only on color.
11+
12+
## Designing legends with redundant coding for scatter plots {-}
13+
14+
- Scatter plots often distinguish different groups by color alone.
15+
- The problem arises because the data points in two separate groups are not particularly distinct from each other.
16+
- Converting the figure to grayscale does not help, because the categories may not be distinguishable enough.
17+
18+
![](images/fig20_1.png)
19+
20+
- To fix this problem, we can use three different symbol shapes, so that the points all look different.
21+
22+
![](images/fig20_2.png)
23+
24+
## Designing legends with redundant coding for line graphs {-}
25+
26+
- In line plots, we could change the line type (solid, dashed, dotted, etc.), but this often yields sub-optimal results, especially for lines that are not straight.
27+
- Also in the following example, notice that the perceived order of the data lines differs from the order of the companies in the legend.
28+
29+
![](images/fig20_3.png)
30+
31+
- To fix this problem, manually reorder the entries in the legend so they match the perceived ordering in the data.
32+
33+
![](images/fig20_4.png)
34+
35+
## Designing figures without legends {-}
36+
37+
- We can typically make our readers’ lives easier if we eliminate the legend altogether.
38+
- Eliminating the legend means that we design the figure in such a way that it is immediately obvious what the various graphical elements represent, even if no explicit legend is present.
39+
- The general strategy we can employ is called direct labeling, whereby we place appropriate text labels or other visual elements that serve as guideposts to the rest of the figure.
40+
41+
![](images/fig20_5.png)
42+
43+
## Other examples {-}
44+
45+
- This line plot is using two different shades of each color, a light one for filled areas and a dark one for lines, outlines, and text.
46+
47+
![](images/fig20_6.png)
48+
49+
- In the case where we map the same variable onto a position along a major axis and onto color, this implies that the reference color bar should run along and be integrated into the same axis.
50+
51+
![](images/fig20_7.png)

images/fig17_01.png

137 KB

images/fig17_03.png

137 KB

images/fig17_05.png

106 KB

images/fig17_06.png

129 KB

images/fig17_07.png

124 KB

images/fig17_08.png

119 KB

images/fig17_09.png

107 KB

images/fig17_10.png

116 KB
