-
-
Notifications
You must be signed in to change notification settings - Fork 0
/
DESIGN
70 lines (56 loc) · 2.58 KB
/
DESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
palloc defines blobs within a medium
Old blob layout
header
8 bytes
1 byte transaction version (0)
7 bytes parent offset
8 bytes increment
entry[]
1-2 bytes key length (0 = end of list)
1-32767 bytes key
8 bytes data length
0-(2^64-1) bytes data
New blob layout
header
1 byte transaction version (0)
15 bytes transaction identifier (randomly generated during commit)
8 bytes height/increment (must be 1 higher than that of highest parent)
8 bytes parent offset [] (0 = end-of-list)
entry[]
1-2 bytes key length (0 = end of list)
1-32767 bytes key
8 bytes data length
0-(2^64-1) bytes data
During GET of a certain key
- Get the current heads, add to processing queue
- Read keys of highest tx in queue
- found -> return
- Add parents of current tx in queue (deduplicate!, insert in-order)
- Iterate, keep reading highest
- Found root of storage, or tombstone = not found
Transaction sync idea (part of keveat, not kvsm):
Nodes have predefined connections (--join <ip>:<port> on cli?)
To start, just a tcp connection, newline-delimited messages
We should maintain a lax approach to syncing data, as udp (future) may lose packets
Hint: bittorrent-like request from multiple, packets declare what they are (like, "parents of I are H[]", instead of "response H[]")
Node A requests current head list from node B
Maybe request heads of null for "current" heads?
B responds with current heads with their increments
While received heads contain unknown identifiers:
(A) requests parent list of identifier I
(B) responds with I's parent identifiers
(A) follows unknown parent path until all parents are known for I
(A) requests contents of I (GET /api/v1/transactions/<I>/contents)
(B) responds with contents
(A) stores transaction, optionally updating it's own heads, keeping track of it's internal height
Initially: Pulls 1 transaction per sync, schedule another sync if anything was pulled
Eventually?: keep track of seen identifiers, pull all?
Including height in sync ensures old data (from orphaned out-of-sync node) doesn't overwrite newer data
Assumes all nodes in the network are trusted
A received transaction must be rejected if it's height doesn't follow the always-incrementing rule
chain root = null = virtual transaction without data, always matches between nodes
Compaction:
Idea stays the same:
Iterate every transaction in storage
Keep transactions with reachable entries
Discard transactions with no reachable entries (update surrounding pointers)