-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathFromStaticToInteractive.qmd
152 lines (119 loc) · 4.52 KB
/
FromStaticToInteractive.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
---
title: "The observable plot example"
author: "Norah Jones- Modified by Pablo Adames"
format:
html:
code-fold: true
---
## Motivation
When an interactive visualization is the goal, the recent observable plot module is pretty convenient.
The data can be obtained up to date from https://www.ncdc.noaa.gov/cdo-web/search for the Seattle-Tacoma international airport weather station dating back to 1941.
## Some data cleaning
```{r data-cleaning, echo=TRUE}
raw_data <- read.csv("data/3868957.csv", header=TRUE, stringsAsFactors=FALSE)
df <- data.frame( dates=as.Date(raw_data$DATE, format = "%Y-%m-%d"))
df["precipitation"] <- raw_data$PRCP
df$STATION <- raw_data$STATION
df$NAME <- raw_data$NAME
head(df$precipitation)
write.csv(df, "data/seatle-data.csv", row.names = FALSE)
```
## Seattle Precipitation by Day (2012 to 2023)
Now we can create the interactive visualization to investigate how net precipitation has evolved over the past 11 years at the Seattle-Tacoma International airport weather station.
```{ojs}
precipitations = FileAttachment("data/seatle-data.csv")
.csv({typed: true})
//create views
viewof beginYear = Inputs.range(
[2012, 2023],
{value: 2012, step: 1, label: "Begin year:"}
)
viewof endYear = Inputs.range(
[beginYear, 2023],
{ value: [beginYear],
step: 1, label: "End year:"
}
)
viewof stat = Inputs.radio(
["mean", "max", "min"],
{ value: "mean",
label: "Statistic:"
}
)
//filter the data read from file
filtered = precipitations.filter(function(precipitations) {
let years = new Date(precipitations.dates).getYear() + 1900;
return (years >= beginYear) && (years <= endYear)
})
// use the filtered data
Plot.plot({
width: 800, height: 500, padding: 0,
color: { scheme: "blues", type: "sqrt"},
y: { tickFormat: i => "JFMAMJJASOND"[i] },
marks: [
Plot.cell(filtered, Plot.group({fill: stat}, {
x: d => new Date(d.dates).getDate(),
y: d => new Date(d.dates).getMonth(),
fill: "precipitation",
inset: 0.5
}))
]
})
```
------------------------------------------------------------------------
## Improvements
The numerical result of the statistical functions is mapped to the intensity of the blue colour. However the range of the colour is dependent on the years we are filtering by.
To make this dimension more useful to discover real trends in precipitation over the years, let's transform the precipitation using the Z-score standarization for the whole data set.
We center the data using the mean and normalize using the standard deviation.
\begin{eqnarray*} p_z &=& \frac{p - \mu_p}{\sigma_p}\\ \end{eqnarray*}
```{r normalizing-precipitation, echo=TRUE}
df$precipitation_std <- (df$precipitation - mean(df$precipitation, na.rm = TRUE))/sd(df$precipitation, na.rm = TRUE)
head(df$precipitation_std)
write.csv(df, "data/seatle-data-std.csv", row.names = FALSE)
```
```{ojs}
precipitations_std = FileAttachment("data/seatle-data-std.csv")
.csv({typed: true})
//create views
viewof beginYear_std = Inputs.range(
[2012, 2023],
{value: 2012, step: 1, label: "Begin year:"}
)
viewof endYear_std = Inputs.range(
[beginYear, 2023],
{ value: [beginYear],
step: 1, label: "End year:"
}
)
viewof stat_std = Inputs.radio(
["mean", "max", "min"],
{ value: "mean",
label: "Statistic:"
}
)
//filter the data read from file
filtered_std = precipitations_std.filter(function(precipitations_std) {
let years = new Date(precipitations_std.dates).getYear() + 1900;
return (years >= beginYear_std) && (years <= endYear_std)
})
// use the filtered data
Plot.plot({
width: 800, height: 500, padding: 0,
color: { scheme: "blues", type: "sqrt"},
y: { tickFormat: i => "JFMAMJJASOND"[i] },
marks: [
Plot.cell(filtered_std, Plot.group({fill: stat_std}, {
x: d => new Date(d.dates).getDate(),
y: d => new Date(d.dates).getMonth(),
fill: "precipitation_std",
inset: 0.5
}))
]
})
```
## Conclusion
The interactivity brings up questions about the data set and the operations carried out on the data to achieve the visualization itself.
Z-score standarization was used to normalized the precipitation over th range of years of data before filtering it to investigate the effect of the passage of time.
Min-max normalization could have been investigated as well.
The effect of this transformation is not too dramatic but the range of time in this data set may not be sufficiently long to observe large contrast.
Interactivity can be useful to express a time series over a complex time plane to study seasonality effects.