Commit 5950f01

Chapters 10 and 11 on proportions and nested proportions (#10)
* Chapter 3 slides
* Ch 10: proportions
* Chapter 11: nested proportions
* Remove numbering from one section
* Adding additional page breaks to make notes more slide-like

---------

Co-authored-by: Lydia Gibson <[email protected]>
1 parent 58adbea commit 5950f01

6 files changed

+235
-10
lines changed

10_visualizing-proportions.Rmd

Lines changed: 124 additions & 5 deletions
@@ -1,10 +1,129 @@
11
# Visualizing proportions
22

3-
**Learning objectives:**
3+
## How to plot percentages: it depends {-}
44

5-
- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
5+
> Many authors categorically reject pie charts and argue in favor of side-by-side or stacked bars. Others defend the use of pie charts in some applications. My own opinion is that none of these visualizations is consistently superior over any other. **Depending on the features of the dataset and the specific story you want to tell, you may want to favor one or the other approach.**
66
7-
## SLIDE 1 {-}
7+
## Pie {-}
88

9-
- ADD SLIDES AS SECTIONS (`##`).
10-
- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
9+
### Where it works {-}
10+
11+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/bundestag-pie-1.png)
12+
13+
- Few groups
14+
- Simple fractions
15+
- Perceptually obvious pattern
16+
17+
## Where it fails {-}
18+
19+
![](images/packed_pie.png)
20+
21+
- More groups
22+
- Perceptual problems with radial area
23+
- Ranking groups (due to perceptual problems)
24+
25+
## Other viz strategies {-}
26+
27+
#### Stacked bar {-}
28+
29+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/bundestag-stacked-bars-1.png)
30+
31+
#### Side-by-side bar {-}
32+
33+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/bundestag-bars-side-by-side-1.png)
34+
35+
**Good:**
36+
37+
Relative comparisons:
38+
39+
- Magnitude
40+
- Ranking
41+
42+
**Bad:**
43+
44+
Comparison of part to the whole
45+
46+
## Waffle {-}
47+
48+
![](https://assets.publishing.service.gov.uk/media/634e9eeed3bf7f6186e6bd44/9.png)
49+
50+
Source: https://www.gov.uk/government/publications/a-bite-sized-guide-to-visualising-data-a-dstl-biscuit-book/a-bite-sized-guide-to-visualising-data
51+
52+
- Equally sweet when pie charts are
53+
- But overcomes the radial area problem
54+
55+
## Side-by-side bars {-}
56+
57+
#### Less bad than the alternatives {-}
58+
59+
Cannot measure pie slices--neither at one time nor over time
60+
61+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/marketshare-pies-1.png)
62+
63+
Cannot easily measure relative sizes of bars (although trend over time is clearer)
64+
65+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/marketshare-stacked-1.png)
66+
67+
## Where it works {-}
68+
69+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/marketshare-side-by-side-1.png)
70+
71+
- Many groups
72+
- Comparisons both within a time period and across time periods
73+
74+
## But could it be made better? {-}
75+
76+
Group by company rather than time
77+
78+
![](images/differently_grouped_bars.png)
79+
80+
Use a line rather than bars
81+
82+
![](images/grouped_bar_as_lines.png)
83+
84+
## Stacked bars {-}
85+
86+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/women-parliament-1.png)
87+
88+
When:
89+
90+
- few--ideally two--groups
91+
- (line showing majority)
92+
93+
## Stacked densities {-}
94+
95+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/health-vs-age-1.png)
96+
97+
When:
98+
99+
- x axis represents a continuous variable
100+
- few groups
101+
102+
Since "[s]tacked densities can be thought of as the limiting case of infinitely many infinitely small stacked bars arranged side-by-side", they suffer from a similar problem:
103+
104+
- Judging relative proportion of groups over time (where magnitude changes are not obvious)
105+
106+
When not:
107+
108+
- Focus on absolute numbers
109+
110+
## Percentages separately as parts of a whole {-}
111+
112+
### Group count compared to total count {-}
113+
114+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/marital-vs-age-facets-1.png)
115+
116+
Works well:
117+
118+
- Part of whole (via "perceptual percentage")
119+
- Trend over time
120+
121+
Some shortfalls:
122+
123+
- Measuring relative size of proportion is imprecise and impressionistic
124+
- Comparing across groups for time T involves lots of looking back and forth (better if facets in single column?)
125+
- Doesn't actually display a proper %
126+
127+
### Relative percentages {-}
128+
129+
![](https://clauswilke.com/dataviz/visualizing_proportions_files/figure-html/marital-vs-age-proportions-1.png)
Lines changed: 111 additions & 5 deletions
@@ -1,10 +1,116 @@
11
# Visualizing nested proportions
22

3-
**Learning objectives:**
3+
## Nested proportions {-}
44

5-
- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
5+
![](https://upload.wikimedia.org/wikipedia/commons/4/41/Floral_matryoshka_set_2_smallest_doll_nested.JPG)
66

7-
## SLIDE 1 {-}
7+
By <a href="//commons.wikimedia.org/wiki/User:BrokenSphere" title="User:BrokenSphere">BrokenSphere</a> - <span class="int-own-work" lang="en">Own work</span>, <a href="https://creativecommons.org/licenses/by-sa/3.0" title="Creative Commons Attribution-Share Alike 3.0">CC BY-SA 3.0</a>, <a href="https://commons.wikimedia.org/w/index.php?curid=3773186">Link</a>
88

9-
- ADD SLIDES AS SECTIONS (`##`).
10-
- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
9+
Part of a whole, broken down into parts of that part, etc.
10+
11+
For example: percent of households in the country with access to electricity, by region and by urban/rural
12+
13+
## _Caveat visualizator_ {-}
14+
15+
A naïve attempt to use the strategies from the previous chapter won't work:
16+
17+
- If one sums the % of each group, it sums to more than 100%.
18+
- Because each visualized category belongs to multiple groups (e.g., bridge era and bridge material)
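
The double counting can be checked with a few lines of arithmetic. The sketch below is plain Python rather than anything from the book, and the bridge counts are made up for illustration; `marginal_pct` is a hypothetical helper name. Each dimension sums to 100% by itself, so showing both as parts of one whole counts every bridge twice.

```python
# Hypothetical bridge counts keyed by (era, material); not the book's data.
bridges = {
    ("crafts", "wood"): 3,
    ("crafts", "iron"): 1,
    ("modern", "steel"): 6,
}

def marginal_pct(counts, position):
    """Percent of all bridges per group along one dimension (0 = era, 1 = material)."""
    total = sum(counts.values())
    groups = {}
    for key, n in counts.items():
        groups[key[position]] = groups.get(key[position], 0) + n
    return {group: 100 * n / total for group, n in groups.items()}

era_pct = marginal_pct(bridges, 0)
material_pct = marginal_pct(bridges, 1)

# Adding the two sets of percentages counts every bridge once per dimension.
naive_total = sum(era_pct.values()) + sum(material_pct.values())
print(era_pct, material_pct, naive_total)
```

With two grouping variables the naive total comes to twice the whole, which is why the bars in the figure below cannot be read as parts of a single 100%.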
19+
20+
![](https://clauswilke.com/dataviz/nested_proportions_files/figure-html/bridges-bars-bad-1.png)
21+
22+
## What may work {-}
23+
24+
- Mosaic plots / treemaps
25+
- Nested pies
26+
- Parallel sets
27+
28+
## Mosaic plots {-}
29+
30+
Each area belongs to two or more groups.
31+
32+
In the example below, each bridge is part of both an era group (crafts, emerging, mature, modern) and a material group (wood, iron, steel).
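
One way to see how a mosaic plot avoids double counting is to compute the tile sizes by hand. This is a minimal sketch in plain Python with made-up counts (not the book's data), assuming the common layout where a tile's width is the era's share of all bridges and its height is the material's share within that era. The tile areas then sum to exactly one whole.

```python
# Hypothetical bridge counts keyed by (era, material); not the book's data.
counts = {
    ("crafts", "wood"): 6,
    ("crafts", "iron"): 2,
    ("mature", "iron"): 4,
    ("mature", "steel"): 8,
}
total = sum(counts.values())

# Total number of bridges per era (the first grouping variable).
era_totals = {}
for (era, _material), n in counts.items():
    era_totals[era] = era_totals.get(era, 0) + n

tiles = {}
for (era, material), n in counts.items():
    width = era_totals[era] / total   # era's share of all bridges
    height = n / era_totals[era]      # material's share within that era
    tiles[(era, material)] = (width, height, width * height)

# The areas are the joint shares, so together they cover the whole plot once.
total_area = sum(area for _w, _h, area in tiles.values())
print(tiles, total_area)
```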
33+
34+
![](https://clauswilke.com/dataviz/nested_proportions_files/figure-html/bridges-treemap-1.png)
35+
36+
## Treemaps {-}
37+
38+
Each child group belongs to one parent group.
39+
40+
For the example below, each state belongs to one and only one region.
41+
42+
![](https://clauswilke.com/dataviz/nested_proportions_files/figure-html/US-states-treemap-1.png)
43+
44+
## Problems of mosaic / treemaps {-}
45+
46+
Gets visually busy very fast
47+
48+
Hard to judge size, among other reasons, because of the different shapes of groups
49+
50+
## Nested pie {-}
51+
52+
How:
53+
54+
- Make each combination of groups a slice of the pie.
55+
- Use colors and color scales to telegraph which slice belongs to which bigger group
56+
57+
Problems:
58+
59+
- Radial area problem
60+
- Readability declines with the number of slices
61+
62+
![](https://clauswilke.com/dataviz/nested_proportions_files/figure-html/bridges-nested-pie2-1.png)
63+
64+
## Parallel sets {-}
65+
66+
![](https://clauswilke.com/dataviz/nested_proportions_files/figure-html/bridges-parallel-sets1-1.png)
67+
68+
## Other strategies {-}
69+
70+
- Sunburst
71+
- Coxcomb / Nightingale
72+
- Multiple charts: overview; selective drill-down
73+
- Interactive drill-down
74+
75+
## Sunburst {-}
76+
77+
### Fixed length {-}
78+
79+
Aka radial treemap (?)
80+
81+
- Inner ring: % of group level 1
82+
- Outer ring: % of group level 2 within level 1
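
The two rings can be sketched as arc angles. The example below is plain Python with a made-up two-level hierarchy (the names `hierarchy`, `inner_ring`, and `outer_ring` are illustrative, not from any sunburst package). It assumes each outer segment sits inside its parent's arc, so its angle is the parent's share of the whole times the child's share within the parent.

```python
# Hypothetical two-level hierarchy: level 1 group -> {level 2 group: count}.
hierarchy = {
    "A": {"a1": 30, "a2": 10},
    "B": {"b1": 45, "b2": 15},
}
grand_total = sum(sum(children.values()) for children in hierarchy.values())

inner_ring = {}  # arc angle in degrees for each level 1 group
outer_ring = {}  # arc angle in degrees for each level 2 group
for parent, children in hierarchy.items():
    parent_total = sum(children.values())
    parent_share = parent_total / grand_total
    inner_ring[parent] = parent_share * 360
    for child, n in children.items():
        within_parent = n / parent_total  # share of level 2 within level 1
        outer_ring[(parent, child)] = parent_share * within_parent * 360

print(inner_ring)
print(outer_ring)
```

Under this assumption the outer segments of a parent add up to the parent's own arc, and both rings add up to the full circle.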
83+
84+
![](https://www.pipinghotdata.com/posts/2021-06-01-custom-interactive-sunbursts-with-ggplot-in-r/custom-interactive-sunbursts-with-ggplot-in-r_files/figure-html5/unnamed-chunk-14-1.png)
85+
86+
### Varying length {-}
87+
88+
Length of bars: another attribute within the second level--in the case below, median salary.
89+
90+
![](https://www.pipinghotdata.com/posts/2021-06-01-custom-interactive-sunbursts-with-ggplot-in-r/custom-interactive-sunbursts-with-ggplot-in-r_files/figure-html5/unnamed-chunk-13-1.png)
91+
92+
### Nice resources {-}
93+
94+
- [Overview](https://www.data-to-viz.com/graph/sunburst.html)
95+
- [Extended worked example](https://www.pipinghotdata.com/posts/2021-06-01-custom-interactive-sunbursts-with-ggplot-in-r/)
96+
- [R package](https://sachijay.github.io/ggsunburst/index.html)
97+
98+
## Coxcomb / Nightingale {-}
99+
100+
![](https://upload.wikimedia.org/wikipedia/commons/1/17/Nightingale-mortality.jpg)
101+
102+
Cause of death
103+
104+
- Blue: preventable disease
105+
- Red: wounds
106+
- Black: all other causes
107+
108+
## Multiple charts: overview; selective drill-down {-}
109+
110+
Just imagine there was drill-down into one category
111+
112+
![](images/hhsize_by_region_and_by_sector.png)
113+
114+
## Interactive drill-down {-}
115+
116+
See [here](https://quarto.org/docs/interactive/ojs/index.html#example).
252 KB

images/grouped_bar_as_lines.png

70.8 KB
126 KB

images/packed_pie.png

140 KB
