Understanding the data structure for Sync mechanism #63

ABresting · 2023-11-20T15:54:09Z

This document provides insight into the type of data structure a Sync mechanism should have. Introduction to Sync store can be found here #62

The Sync mechanism should use messageHash as the central entity so a node can uniquely understand the subjected Waku message. So we aim to synchronize a bunch of messageHashes let's say a list of them [msgHash1,msgHash2,msgHash3,...]. There is no need to have a key-value store where the messageHash is the key and Waku message is the value, because to uniquely sync messageHash itself is sufficient.

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:

a) Waku message timestamp
b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering will help in any of the following solutions we aim to go forward with, such as:

Using some sort of Prolly trees into store Prolly Trees under Waku synchronization process. #71
Simply swapping the key-value store itself to a synchronised backend, such as GossipLog Description of GossipLog in the context of Waku #68
Spin out some in-house technique to sync entries #

The text was updated successfully, but these errors were encountered:

SionoiS · 2023-11-20T21:10:31Z

The Sync mechanism should use messageHash as the central entity so a node can uniquely understand the subjected Waku message.

💯

There is no need to have a key-value store where the messageHash is the key and Waku message is the value, because to uniquely sync messageHash itself is sufficient.

How so? We could assume that all message have valid RLN proof to save some bits but we need the payload at some point no?

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:

a) Waku message timestamp b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering of the messages should be b to enable binary search. The indexes would allow efficient search by other field (time, topics, payload?).

The ordering will help in any of the following solutions we aim to go forward with, such as:

Using some sort of Prolly trees into store #

Simply swapping the key-value store itself to a synchronised backend, such as GossipLog #

Spin out some in-house technique to sync entries #

All those are very similar.
GossipLog use Prolly trees under the hood and the only way I know to do 3. would be to use Prolly trees or MSTs.

ABresting · 2023-11-21T21:36:22Z

How so? We could assume that all message have valid RLN proof to save some bits but we need the payload at some point no?

Yeah, so basically the aim here is to sync the messageHashes since using only that info we can do later operation/some-sort-of--filtering using actual message which a messageHash represents.

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:
a) Waku message timestamp b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering of the messages should be b to enable binary search. The indexes would allow efficient search by other field (time, topics, payload?).

if we do the ordering of messageHashes then we can makes a Prolly tree or sync mechanism, indexes will help later once we realize which exact messages are to be transported (by querying them from DB/indexes).

SionoiS · 2023-11-21T22:26:15Z

Yeah, so basically the aim here is to sync the messageHashes since using only that info we can do later operation/some-sort-of--filtering using actual message which a messageHash represents.

I misunderstood, the Sync mechanism would only use hashes right?

if we do the ordering of messageHashes then we can makes a Prolly tree or sync mechanism, indexes will help later once we realize which exact messages are to be transported (by querying them from DB/indexes).

What I envision;

KV store for #s and messages.
One Prolly tree based index per message field we need to search.
Sync can be done with any query on any index (it's just a set diff).
Stream the messages to the node that sent the query.

ABresting self-assigned this Nov 20, 2023

ABresting mentioned this issue Nov 20, 2023

Sync store baseline understanding #62

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Understanding the data structure for Sync mechanism #63

Understanding the data structure for Sync mechanism #63

ABresting commented Nov 20, 2023 •

edited

Loading

SionoiS commented Nov 20, 2023

ABresting commented Nov 21, 2023

SionoiS commented Nov 21, 2023

Understanding the data structure for Sync mechanism #63

Understanding the data structure for Sync mechanism #63

Comments

ABresting commented Nov 20, 2023 • edited Loading

SionoiS commented Nov 20, 2023

ABresting commented Nov 21, 2023

SionoiS commented Nov 21, 2023

ABresting commented Nov 20, 2023 •

edited

Loading