Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Understanding the data structure for Sync mechanism #63

Open
ABresting opened this issue Nov 20, 2023 · 3 comments
Open

Understanding the data structure for Sync mechanism #63

ABresting opened this issue Nov 20, 2023 · 3 comments
Assignees

Comments

@ABresting
Copy link

ABresting commented Nov 20, 2023

This document provides insight into the type of data structure a Sync mechanism should have. Introduction to Sync store can be found here #62

The Sync mechanism should use messageHash as the central entity so a node can uniquely understand the subjected Waku message. So we aim to synchronize a bunch of messageHashes let's say a list of them [msgHash1,msgHash2,msgHash3,...]. There is no need to have a key-value store where the messageHash is the key and Waku message is the value, because to uniquely sync messageHash itself is sufficient.

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:

a) Waku message timestamp
b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering will help in any of the following solutions we aim to go forward with, such as:

  1. Using some sort of Prolly trees into store Prolly Trees under Waku synchronization process. #71
  2. Simply swapping the key-value store itself to a synchronised backend, such as GossipLog Description of GossipLog in the context of Waku #68
  3. Spin out some in-house technique to sync entries #
@SionoiS
Copy link

SionoiS commented Nov 20, 2023

The Sync mechanism should use messageHash as the central entity so a node can uniquely understand the subjected Waku message.

💯

There is no need to have a key-value store where the messageHash is the key and Waku message is the value, because to uniquely sync messageHash itself is sufficient.

How so? We could assume that all message have valid RLN proof to save some bits but we need the payload at some point no?

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:

a) Waku message timestamp b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering of the messages should be b to enable binary search. The indexes would allow efficient search by other field (time, topics, payload?).

The ordering will help in any of the following solutions we aim to go forward with, such as:

  1. Using some sort of Prolly trees into store #
  2. Simply swapping the key-value store itself to a synchronised backend, such as GossipLog #
  3. Spin out some in-house technique to sync entries #

All those are very similar.
GossipLog use Prolly trees under the hood and the only way I know to do 3. would be to use Prolly trees or MSTs.

@ABresting
Copy link
Author

How so? We could assume that all message have valid RLN proof to save some bits but we need the payload at some point no?

Yeah, so basically the aim here is to sync the messageHashes since using only that info we can do later operation/some-sort-of--filtering using actual message which a messageHash represents.

Since there is no deterministic ordering in that case we follow some sort of ordering approach. Serialized based on:
a) Waku message timestamp b) in case of conflict then chronological order based on messageHash i.e. 0xaaaa -> 0xabbbb.

The ordering of the messages should be b to enable binary search. The indexes would allow efficient search by other field (time, topics, payload?).

if we do the ordering of messageHashes then we can makes a Prolly tree or sync mechanism, indexes will help later once we realize which exact messages are to be transported (by querying them from DB/indexes).

@SionoiS
Copy link

SionoiS commented Nov 21, 2023

Yeah, so basically the aim here is to sync the messageHashes since using only that info we can do later operation/some-sort-of--filtering using actual message which a messageHash represents.

I misunderstood, the Sync mechanism would only use hashes right?

if we do the ordering of messageHashes then we can makes a Prolly tree or sync mechanism, indexes will help later once we realize which exact messages are to be transported (by querying them from DB/indexes).

What I envision;

  1. KV store for #s and messages.
  2. One Prolly tree based index per message field we need to search.
  3. Sync can be done with any query on any index (it's just a set diff).
  4. Stream the messages to the node that sent the query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants