This repository was archived by the owner on Aug 13, 2019. It is now read-only.

Commit 7aac0cd

author
Fabian Reinartz
committed
docs: add new WAL format
Signed-off-by: Fabian Reinartz <[email protected]>
1 parent 6a05e6d commit 7aac0cd

1 file changed

+72
-0
lines changed

docs/format/wal.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
1+
# WAL Disk Format
2+
3+
The write ahead log operates in segments that are numbered and sequential,
4+
e.g. `000000`, `000001`, `000002`, etc., and are limited to 128MB by default.
5+
A segment is written to in pages of 32KB. Only the last page of the most recent segment
6+
may be partial. A WAL record is an opaque byte slice that gets split up into sub-records
7+
should it exceed the remaining space of the current page. Records are never split across
8+
segment boundaries.
9+
The encoding of pages is largely borrowed from [LevelDB's/RocksDB's write ahead log][1].
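
The segment and page sizes above imply some simple arithmetic, sketched here in Python. The six-digit zero padding is inferred from the example names `000000`, `000001`, `000002`; the helper names `segment_name` and `pages_per_segment` are hypothetical and not taken from the implementation.

```python
SEGMENT_SIZE = 128 * 1024 * 1024  # default segment size limit stated in the text
PAGE_SIZE = 32 * 1024             # page size stated in the text


def segment_name(index: int) -> str:
    """Format a segment index as a zero-padded, six-digit name (assumed width)."""
    if index < 0:
        raise ValueError("segment index must be non-negative")
    return f"{index:06d}"


def pages_per_segment(segment_size: int = SEGMENT_SIZE,
                      page_size: int = PAGE_SIZE) -> int:
    """Number of whole pages that fit into one segment."""
    return segment_size // page_size
```

With the default values, a full segment holds 4096 pages.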
10+
11+
The notable deviation is that a record fragment is encoded as:
12+
13+
┌───────────┬──────────┬────────────┬──────────────┐
14+
│ type <1b> │ len <2b> │ CRC32 <4b> │ data <bytes> │
15+
└───────────┴──────────┴────────────┴──────────────┘
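
The fragment layout can be illustrated with a minimal Python sketch that splits a record across 32KB pages. Several details are assumptions, because the text does not specify them: big-endian byte order, the fragment type codes (modelled on the LevelDB/RocksDB full/first/middle/last scheme), a checksum computed over the fragment data only, and the IEEE CRC32 polynomial from `zlib` (the real implementation may use a different polynomial). Padding of an unusable page tail is not emitted here.

```python
import struct
import zlib
from typing import List

PAGE_SIZE = 32 * 1024
HEADER_SIZE = 1 + 2 + 4  # type <1b> + len <2b> + CRC32 <4b>

# Hypothetical type codes, modelled on the LevelDB/RocksDB scheme.
REC_FULL, REC_FIRST, REC_MIDDLE, REC_LAST = 1, 2, 3, 4


def encode_fragment(frag_type: int, data: bytes) -> bytes:
    """Encode one fragment as type | len | CRC32 | data (big-endian assumed)."""
    checksum = zlib.crc32(data) & 0xFFFFFFFF  # assumption: CRC over data only
    return struct.pack(">BHI", frag_type, len(data), checksum) + data


def split_record(record: bytes, page_used: int = 0) -> List[bytes]:
    """Split a record into encoded fragments.

    page_used is the number of bytes already written to the current page.
    """
    fragments: List[bytes] = []
    offset = 0
    while True:
        space = PAGE_SIZE - page_used - HEADER_SIZE
        if space <= 0:
            # No room for a header plus data: continue on a fresh page.
            page_used = 0
            continue
        chunk = record[offset:offset + space]
        offset += len(chunk)
        is_first = not fragments
        is_last = offset >= len(record)
        if is_first and is_last:
            frag_type = REC_FULL
        elif is_first:
            frag_type = REC_FIRST
        elif is_last:
            frag_type = REC_LAST
        else:
            frag_type = REC_MIDDLE
        fragments.append(encode_fragment(frag_type, chunk))
        if is_last:
            return fragments
        page_used = 0  # every following fragment starts a new page
```

A record that fits into the remaining page space yields a single fragment; a larger one yields a first fragment that fills the page exactly, followed by further fragments on fresh pages.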
16+
17+
## Record encoding
18+
19+
The records written to the write ahead log are encoded as follows:
20+
21+
### Series records
22+
23+
Series records encode the labels that identify a series and its unique ID.
24+
25+
┌────────────────────────────────────────────┐
26+
│ type = 1 <1b> │
27+
├────────────────────────────────────────────┤
28+
│ ┌─────────┬──────────────────────────────┐ │
29+
│ │ id <8b> │ n = len(labels) <uvarint> │ │
30+
│ ├─────────┴────────────┬─────────────────┤ │
31+
│ │ len(str_1) <uvarint> │ str_1 <bytes> │ │
32+
│ ├──────────────────────┴─────────────────┤ │
33+
│ │ ... │ │
34+
│ ├───────────────────────┬────────────────┤ │
35+
│ │ len(str_2n) <uvarint> │ str_2n <bytes> │ │
36+
│ └───────────────────────┴────────────────┘ │
37+
│ . . . │
38+
└────────────────────────────────────────────┘
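
As an illustration of the layout above, the following Python sketch encodes a series record. It assumes a big-endian 8-byte ID, UTF-8 strings, and the base-128 unsigned varint used by Go's `encoding/binary`; the names `put_uvarint` and `encode_series_record` are hypothetical.

```python
import struct
from typing import List, Tuple

SERIES_RECORD_TYPE = 1


def put_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 unsigned varint."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative values")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_series_record(series: List[Tuple[int, List[Tuple[str, str]]]]) -> bytes:
    """Encode (id, labels) entries; labels is a list of (name, value) pairs."""
    buf = bytearray([SERIES_RECORD_TYPE])
    for series_id, labels in series:
        buf += struct.pack(">Q", series_id)  # id <8b>, big-endian assumed
        buf += put_uvarint(len(labels))      # n = len(labels)
        for name, value in labels:
            # Each label contributes two strings, giving str_1 .. str_2n.
            for text in (name, value):
                raw = text.encode("utf-8")
                buf += put_uvarint(len(raw))
                buf += raw
    return bytes(buf)
```

Each label contributes a name and a value, which is why the diagram lists `2n` strings for `n` labels.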
39+
40+
### Sample records
41+
42+
Sample records encode samples as a list of triples `(series_id, timestamp, value)`.
43+
Series ID and timestamp are encoded as deltas relative to the first sample.
44+
45+
┌──────────────────────────────────────────────────────────────────┐
46+
│ type = 2 <1b> │
47+
├──────────────────────────────────────────────────────────────────┤
48+
│ ┌────────────────────┬───────────────────────────┬─────────────┐ │
49+
│ │ id <8b> │ timestamp <8b> │ value <8b> │ │
50+
│ └────────────────────┴───────────────────────────┴─────────────┘ │
51+
│ ┌────────────────────┬───────────────────────────┬─────────────┐ │
52+
│ │ id_delta <uvarint> │ timestamp_delta <uvarint> │ value <8b> │ │
53+
│ └────────────────────┴───────────────────────────┴─────────────┘ │
54+
│ . . . │
55+
└──────────────────────────────────────────────────────────────────┘
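
The delta scheme can be sketched in Python as follows. This follows the diagram literally: the first sample is written in full and each later sample stores its ID and timestamp as `uvarint` deltas against the first sample. Big-endian order, a signed 64-bit first timestamp, and IEEE 754 doubles for values are assumptions. Because the diagram specifies unsigned varints, the sketch rejects negative deltas.

```python
import struct
from typing import List, Tuple

SAMPLE_RECORD_TYPE = 2


def put_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 unsigned varint."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative values")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_sample_record(samples: List[Tuple[int, int, float]]) -> bytes:
    """Encode a non-empty list of (series_id, timestamp, value) triples."""
    if not samples:
        raise ValueError("at least one sample is required")
    first_id, first_ts, first_value = samples[0]
    buf = bytearray([SAMPLE_RECORD_TYPE])
    # First sample in full: id <8b>, timestamp <8b>, value <8b>.
    buf += struct.pack(">Qqd", first_id, first_ts, first_value)
    for series_id, timestamp, value in samples[1:]:
        # Deltas are taken against the first sample, not the previous one.
        buf += put_uvarint(series_id - first_id)
        buf += put_uvarint(timestamp - first_ts)
        buf += struct.pack(">d", value)
    return bytes(buf)
```

For samples that are close together in ID and time, each delta fits into one or two bytes instead of eight.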
56+
57+
### Tombstone records
58+
59+
Tombstone records encode tombstones as a list of triples `(series_id, min_time, max_time)`
60+
and specify an interval for which samples of a series were deleted.
61+
62+
63+
┌─────────────────────────────────────────────────────┐
64+
│ type = 3 <1b> │
65+
├─────────────────────────────────────────────────────┤
66+
│ ┌─────────┬───────────────────┬───────────────────┐ │
67+
│ │ id <8b> │ min_time <varint> │ max_time <varint> │ │
68+
│ └─────────┴───────────────────┴───────────────────┘ │
69+
│ . . . │
70+
└─────────────────────────────────────────────────────┘
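
A tombstone record could be encoded along these lines. The sketch assumes that `<varint>` means the zig-zag signed varint of Go's `encoding/binary`, and that the 8-byte ID is big-endian; the function names are hypothetical.

```python
import struct
from typing import List, Tuple

TOMBSTONE_RECORD_TYPE = 3


def put_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 unsigned varint."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative values")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def put_varint(value: int) -> bytes:
    """Encode a signed 64-bit integer as a zig-zag varint."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit into 64 bits")
    zigzag = ((value << 1) ^ (value >> 63)) & 0xFFFFFFFFFFFFFFFF
    return put_uvarint(zigzag)


def encode_tombstone_record(tombstones: List[Tuple[int, int, int]]) -> bytes:
    """Encode a list of (series_id, min_time, max_time) triples."""
    buf = bytearray([TOMBSTONE_RECORD_TYPE])
    for series_id, min_time, max_time in tombstones:
        buf += struct.pack(">Q", series_id)  # id <8b>, big-endian assumed
        buf += put_varint(min_time)
        buf += put_varint(max_time)
    return bytes(buf)
```

Signed varints allow negative timestamps in the deleted interval, which an unsigned encoding could not represent.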
71+
72+
[1]: https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log-File-Format
