Prolly Tree construction and Waku Message #77

ABresting · 2023-12-14T15:54:12Z


The Synchronization (Sync) protocol ensures that all nodes in the Waku network hold consistent data and messages. It operates between nodes: a node sends a Sync request to a peer, which processes the request and starts the synchronization using a Prolly tree. A Prolly tree is built from key-value pairs; in our case the timestamp is the key and the messageHash is the value. We need Waku messages to construct the Prolly tree, and there are two ways to get them:

1. Using Waku Relay: We get the message (non-ephemeral?) directly from Waku Relay after it is validated, especially RLN validation.
   - Messages may arrive out of order? But not older than 20 seconds, since relay drops messages older than 20 seconds?
   - We need to compute the messageHash explicitly for the key-value attribute of the Prolly tree? If the Store protocol also does this, then it's twice the work?
2. Using Waku Store/Archive: We fetch the Waku messages from the Waku Store. This gives us sorted/ordered Waku messages.
   - How much time (in seconds?) does a message take to be archived (from being received at relay to being stored in the DB)?
   - One benefit is that we get ordered messages (this also depends on when we request them)?
   - We need the messageHash (not part of the Waku message) to construct the Prolly tree, and it has to be computed explicitly.
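To make the key-value layout concrete, here is a minimal sketch of how (timestamp, messageHash) entries could be grouped into content-defined chunks, which is the core idea behind a Prolly tree. It is an illustration under assumptions, not the implementation discussed in this issue: `BOUNDARY_MODULUS`, `is_boundary`, `chunk_leaves` and `root_digest` are hypothetical names, and only a single level is built, whereas a real Prolly tree applies the same boundary rule recursively to the upper levels.

```python
import hashlib
from typing import List, Tuple

Entry = Tuple[int, bytes]  # (timestamp, messageHash)

# Hypothetical tuning knob: on average one entry in 4 closes a chunk.
BOUNDARY_MODULUS = 4


def is_boundary(entry: Entry) -> bool:
    """Content-defined boundary: the decision depends only on the entry itself."""
    ts, msg_hash = entry
    digest = hashlib.sha256(ts.to_bytes(8, "big") + msg_hash).digest()
    return int.from_bytes(digest[:4], "big") % BOUNDARY_MODULUS == 0


def chunk_leaves(entries: List[Entry]) -> List[List[Entry]]:
    """Sort entries by (timestamp, hash) and split the sequence at boundary entries."""
    chunks: List[List[Entry]] = []
    current: List[Entry] = []
    for entry in sorted(entries):
        current.append(entry)
        if is_boundary(entry):
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def root_digest(entries: List[Entry]) -> bytes:
    """Hash each chunk, then hash the chunk digests together (one level only)."""
    root = hashlib.sha256()
    for chunk in chunk_leaves(entries):
        chunk_hash = hashlib.sha256()
        for ts, msg_hash in chunk:
            chunk_hash.update(ts.to_bytes(8, "big") + msg_hash)
        root.update(chunk_hash.digest())
    return root.digest()
```

Because entries are sorted before chunking and boundaries depend only on entry content, two nodes holding the same set of entries get the same digest regardless of the order in which the messages arrived.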

Essentially we should have some buffer period, let's say 5-10 seconds so that messages come in-order and there is less possibility of out-of-order insertion. It will also allow peer nodes to get delayed messages from network itself.
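The buffer period could look roughly like the sketch below: entries are held in a min-heap and released in timestamp order only once they are older than the chosen delay. The class name `ReorderBuffer` and its methods are hypothetical and only illustrate the idea; timestamps are assumed to be in seconds here.

```python
import heapq
from typing import List, Tuple

Entry = Tuple[int, bytes]  # (timestamp in seconds, messageHash)


class ReorderBuffer:
    """Holds entries until they are older than `delay_s`, then releases them in timestamp order."""

    def __init__(self, delay_s: int) -> None:
        self.delay_s = delay_s
        self._heap: List[Entry] = []

    def push(self, entry: Entry) -> None:
        # The heap keeps the entry with the smallest timestamp at index 0.
        heapq.heappush(self._heap, entry)

    def pop_ready(self, now_s: int) -> List[Entry]:
        """Return, oldest first, every entry whose timestamp is at least `delay_s` old."""
        ready: List[Entry] = []
        while self._heap and self._heap[0][0] <= now_s - self.delay_s:
            ready.append(heapq.heappop(self._heap))
        return ready
```

A message that arrives later than the delay, with a timestamp older than entries already released, would still be inserted out of order, so the buffer reduces out-of-order insertions rather than eliminating them.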

Let's discuss the pros and cons of both approaches.

@Ivansete-status @jm-clius @alrevuelta @vpavlin @chaitanyaprem


alrevuelta · 2023-12-15T09:28:40Z

Following Waku's modular approach, I would lean towards 2), since a node can technically run store but not relay (not sure if that makes sense, though, since messages have to come from somewhere).

On the other hand, I see store-sync as part of the store protocol, which also justifies 2).

> Essentially we should have some buffer period, let's say 5-10 seconds so that messages come in-order and there is less possibility of out-of-order insertion. It will also allow peer nodes to get delayed messages from network itself.

Indeed, good point. In that case, I would wait MaxEpochGap, since messages older than that are rejected.

chaitanyaprem · 2023-12-15T09:46:29Z

Since Store sync is related to the Store protocol and is only valid for store nodes (which may not necessarily be relay nodes themselves, i.e. I can run a node with filter and store connecting to another relay node), we should go with approach 2.

> Essentially we should have some buffer period, let's say 5-10 seconds so that messages come in-order and there is less possibility of out-of-order insertion. It will also allow peer nodes to get delayed messages from network itself.

As @alrevuelta suggested, it makes sense to use the existing param MaxEpochGap. But I am wondering whether an additional buffer time should be considered to cater for local processing and storage delay. Not sure how much to consider, though.
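As a rough illustration of combining the two delays, the sketch below computes a cutoff time and selects only archived rows older than it. Everything here is an assumption for the sake of the example: `SyncWindow`, `select_for_tree` and the field names are hypothetical, the rows stand in for whatever the archive backend returns, and no actual value for MaxEpochGap or the processing margin is implied.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Row = Tuple[int, bytes]  # (timestamp in seconds, messageHash)


@dataclass(frozen=True)
class SyncWindow:
    max_epoch_gap_s: int      # placeholder for the MaxEpochGap-derived duration
    processing_margin_s: int  # hypothetical extra time for local processing and storage

    def cutoff(self, now_s: int) -> int:
        """Newest timestamp that is considered settled at time `now_s`."""
        return now_s - (self.max_epoch_gap_s + self.processing_margin_s)


def select_for_tree(rows: Iterable[Row], window: SyncWindow, now_s: int) -> List[Row]:
    """Keep rows at or before the cutoff, sorted by timestamp for in-order insertion."""
    limit = window.cutoff(now_s)
    return sorted(row for row in rows if row[0] <= limit)
```

Keeping the margin as a separate field makes it easy to tune once the real archiving latency has been measured.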

ABresting · 2023-12-15T10:27:22Z

Thanks @alrevuelta @chaitanyaprem, awesome! It sounds good that we use the 2nd option and query data from the Store/Archive after MaxEpochGap.

Ivansete-status · 2023-12-20T15:43:22Z

IMO, the Sync protocol should operate only when the Store protocol is mounted, and its purpose is to ensure all Store nodes contain the same singleton set of messages. That would require Sync communication, but only between Store nodes.
Therefore, I opt for 2.

jm-clius · 2024-01-03T09:17:06Z

My 2c: although we may choose to mount the sync protocol only as part of Store protocol, conceptually it should be agnostic as to where the message hashes being synced come from (whether from an existing archive/store, relay or filter). We are simply building a way to synchronise between different caches of message hashes in the network. How these caches are initialised and updated is then merely a matter of how we choose to wire the different moving parts together: e.g. periodically updating the prolly tree with hashes from the archive backend, etc. At least for POC purposes I agree that we can simply populate messages older than MaxEpochGap from an existing archive.

SionoiS · 2024-01-03T13:37:31Z

The tree and sync process (Waku Cache?) should be agnostic to the origin and the number of messages.

Inserting messages as they arrive or buffering them first is an implementation detail that doesn't matter too much at this point.

IMO, the flow should be:
Relay (msg) -> message hasher (hash + msg) -> Archive (hash + msg) & Tree/sync (time + hash)
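A minimal sketch of that flow, under stated assumptions: the hash is computed once per message and then shared by both the archive and the tree/sync side, which avoids the duplicated hashing raised in the original post. `Message`, `Node`, `message_hash` and `on_relay_message` are hypothetical names, and the hash below is a stand-in; the real field order and encoding are defined by the Waku message specification.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Message:
    pubsub_topic: str
    content_topic: str
    payload: bytes
    timestamp: int  # nanoseconds


def message_hash(msg: Message) -> bytes:
    """Stand-in hash over the message fields (not the spec-defined encoding)."""
    h = hashlib.sha256()
    h.update(msg.pubsub_topic.encode())
    h.update(msg.payload)
    h.update(msg.content_topic.encode())
    h.update(msg.timestamp.to_bytes(8, "big"))
    return h.digest()


@dataclass
class Node:
    archive: Dict[bytes, Message] = field(default_factory=dict)
    tree_entries: List[Tuple[int, bytes]] = field(default_factory=list)

    def on_relay_message(self, msg: Message) -> bytes:
        digest = message_hash(msg)      # message hasher: computed exactly once
        if digest not in self.archive:  # ignore duplicates
            self.archive[digest] = msg  # Archive receives (hash + msg)
            self.tree_entries.append((msg.timestamp, digest))  # Tree/sync receives (time + hash)
        return digest
```

The tree side only ever sees (time, hash) pairs, so it stays agnostic to where the message came from.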

ABresting mentioned this issue Dec 14, 2023

Sync store baseline understanding #62

Open

ABresting mentioned this issue Jan 21, 2024

75/WAKU2-SYNC vacp2p/rfc#660

Closed
