|
1 |
| -# Perform operations on groups of rows |
| 1 | +--- |
| 2 | +sheet: Sheet |
| 3 | +--- |
| 4 | +# Create a window over consecutive rows |
2 | 5 |
|
3 |
| -The window function creates a new column where each row contains of rows before and/or after the current row in the source column. |
| 6 | +Window functions enable computations that relate the current row to surrounding rows, like cumulative sums, rolling averages, or lead/lag computations. |
4 | 7 |
|
5 |
| -Window functions enable computations that relate the current window to surrounding rows, for example: |
6 |
| -- cumulative sum |
7 |
| -- rolling averages |
8 |
| -- lead/lag computations |
| 8 | +{help.commands.addcol-window} |
9 | 9 |
|
10 |
| -## Window functions operation on columns |
| 10 | +With large window sizes, use [:keys]g'[/] (`freeze-sheet`) to calculate all cells and copy the entire sheet into a new source sheet, which conserves CPU. |
11 | 11 |
|
12 |
| -Create a window for a column. The new column will contain the current row, and also any before or after rows specified when creating the window. |
| 12 | +## Examples |
13 | 13 |
|
14 |
| -- {help.command.addcol-window} |
| 14 | + date color price |
| 15 | + ---------- ----- ----- |
| 16 | + 2024-09-01 R 38 |
| 17 | + 2024-09-02 B 28 |
| 18 | + 2024-09-03 R 100 |
| 19 | + 2024-09-03 B 33 |
| 20 | + 2024-09-03 B 99 |
15 | 21 |
|
16 |
| -To conserve memory and speed with large windows, one approach is to: |
17 |
| -1. add any expressions that operate on the window expression. |
18 |
| -2. Freeze the sheet [:keys]g'[/]. |
19 | 22 |
|
20 |
| -## Examples |
| 23 | +1. [:keys]#[/] (`type-int`) on the **price** column to type as int. |
| 24 | +2. [:keys]w[/] (`addcol-window`) on the **price** column, followed by `1 2`, to create a window consisting of 4 rows: 1 row before the current row, and 2 rows after. |
| 25 | +3. To create a moving average of the values in the window, add a new column with a Python expression: [:keys]=[/] (`addcol-expr`) |
| 26 | +followed by `sum(price_window)/len(price_window)`. |
21 | 27 |
|
22 |
| -After creating a window, use a python expression to operate on it. |
| 28 | +date color price price_window sum(price_window)/len(price_window) |
| 29 | +---------- ----- ----- ------------------- ----------------------------------- |
| 30 | +2024-09-01 R 38 [4] ; 38; 28; 100 41.5 |
| 31 | +2024-09-02 B 28 [4] 38; 28; 100; 33 49.75 |
| 32 | +2024-09-03 R 100 [4] 28; 100; 33; 99 65.0 |
| 33 | +2024-09-03 B 33 [4] 100; 33; 99; 58.0 |
| 34 | +2024-09-03 B 99 [4] 33; 99; ; 33.0 |
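The arithmetic in the table above can be checked with a small plain-Python sketch. It does not use VisiData; the `windows` helper and its `pad` parameter are hypothetical names introduced here. It assumes that positions past either end of the sheet contribute 0 to the sum while still counting toward the window length, which is what the averages in the table imply.

```python
def windows(values, before, after, pad=0):
    """Return one fixed-size window per value, padded with `pad` past either end.

    Hypothetical helper for illustration only; not part of VisiData.
    """
    padded = [pad] * before + list(values) + [pad] * after
    size = before + 1 + after
    return [padded[i:i + size] for i in range(len(values))]


# The price column as shown in the windowed table.
prices = [38, 28, 100, 33, 99]

# Equivalent of `w` followed by `1 2`: 1 row before, 2 rows after.
price_window = windows(prices, before=1, after=2)

# Equivalent of `=sum(price_window)/len(price_window)`.
moving_avg = [sum(w) / len(w) for w in price_window]

print(moving_avg)
```

Every window has length 4, so the rows at the edges are averaged over fewer real values than the divisor suggests.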
23 | 35 |
|
24 |
| -For example, given a windown column 'win', to create a moving average of the |
25 |
| -values in the window, add a new column with a python expression. |
26 | 36 |
|
27 |
| -``` |
28 |
| -=sum(win)/len(win) |
29 |
| -``` |
| 37 | +## Workflows |
30 | 38 |
|
31 | 39 | ### Create a cumulative sum
|
32 | 40 |
|
33 |
| -- set the before window size to >= the total number of rows in the table, and the after rows to 0. |
34 |
| -- add an expression of `sum(windows)` where `window` is the name of the window function column. |
| 41 | +1. Set the before window size to the total number of rows in the table, and the after rows to 0. In the above example that would be `w 5 0` (`addcol-window`). |
| 42 | +2. Add an expression ([:keys]=[/], `addcol-expr`) of `sum(window)`, where `window` is the name of the window function column. |
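The two steps above can be sketched in plain Python, outside VisiData. The sketch assumes each window is simply the slice of rows from the start of the table up to the current row. The variable names are illustrative, and the prices are the ones shown in the windowed table earlier.

```python
from itertools import accumulate

prices = [38, 28, 100, 33, 99]
n = len(prices)

# A window of `n` rows before and 0 rows after reaches back to the first row
# for every position, so each window holds all values seen so far.
trailing = [prices[max(0, i - n):i + 1] for i in range(n)]

# Equivalent of the `sum(window)` expression, one value per row.
cumulative = [sum(w) for w in trailing]

# Cross-check against the standard library's running total.
assert cumulative == list(accumulate(prices))
print(cumulative)
```

Each row re-sums its whole window, so the work grows quadratically with the number of rows. That is one reason freezing the sheet can help on large tables.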
35 | 43 |
|
36 | 44 | ### Compute rank
|
37 | 45 |
|
38 |
| -https://github.com/saulpw/visidata/discussions/2280#discussioncomment-8314593 |
| 46 | +See https://github.com/saulpw/visidata/discussions/2280 for a discussion on how to use window functions to compute a rank column, where the rank restarts from 1 each time the value changes. For example: |
| 47 | + |
| 48 | +value rank |
| 49 | +----- ---- |
| 50 | +A 1 |
| 51 | +A 2 |
| 52 | +B 1 |
| 53 | +C 1 |
| 54 | +C 2 |
| 55 | +C 3 |
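As a rough illustration of the target output only, here is a plain-Python sketch that derives the rank from a window of all rows up to the current one. It is not necessarily the approach taken in the linked discussion, and `run_rank` is a hypothetical name.

```python
def run_rank(values):
    """Rank each value within its run of consecutive equal values, starting at 1.

    Hypothetical helper for illustration only; not part of VisiData.
    """
    ranks = []
    for i, current in enumerate(values):
        window = values[:i + 1]  # all rows up to and including the current one
        rank = 0
        # Walk backwards from the current row until the value changes.
        for v in reversed(window):
            if v != current:
                break
            rank += 1
        ranks.append(rank)
    return ranks


values = ["A", "A", "B", "C", "C", "C"]
ranks = run_rank(values)
print(list(zip(values, ranks)))
```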
39 | 56 |
|
40 | 57 | ### Compute the change between rows
|
41 | 58 |
|
42 |
| -1. Create a window function of size 1 before and 0 after |
43 |
| -2. Add a python expression. Assume the window function column is 'win', and the current (integer) column is named seconds: |
| 59 | +1. `w 1 0` to create a window function of size 1 before and 0 after. |
| 60 | +2. Add a Python expression. Assume the window function column is 'win': |
44 | 61 | `=win[1] - win[0] if len(win) > 1 else None`
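The expression can be tried outside VisiData with a small sketch. The `seconds` values are made up for illustration. The sketch assumes the first row's window holds only the current value, which is the case the `len(win) > 1` guard covers.

```python
# Hypothetical integer column.
seconds = [10, 12, 17, 17, 30]

# Window of 1 row before and 0 rows after: [previous, current],
# or just [current] for the first row.
wins = [seconds[max(0, i - 1):i + 1] for i in range(len(seconds))]

# Same expression as above, applied to each row's window.
deltas = [win[1] - win[0] if len(win) > 1 else None for win in wins]

print(deltas)
```

The first row has no predecessor, so its change is `None` rather than a number.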
|
45 | 62 |
|
46 |
| - |
|