feat: finalize change set file format #15

Closed. yihuang wants to merge 16 commits.

Collaborator

@yihuang yihuang commented Dec 13, 2022

Closes: #14

The changeset file format is:

version: varint
size: varint  # total size of kvpairs
kvpairs: [StoreKVPair] # list of length prefixed protobuf msg
message StoreKVPair {
  bool delete = 1;
  bytes key = 2;
  bytes value = 3;
}
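
A minimal Python sketch of this layout, assuming the varints are unsigned LEB128 (as in protobuf) and that each message's length prefix is also a varint; neither detail is stated above. The messages are hand-encoded so the example has no dependencies, and the helper names (`write_version`, `read_versions`, and so on) are illustrative, not taken from the PR.

```python
import io
from typing import BinaryIO, Iterable, Iterator, List, Tuple

KVPair = Tuple[bool, bytes, bytes]  # (delete, key, value)


def encode_varint(n: int) -> bytes:
    """Unsigned LEB128 (assumed to match the format's varint)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(f: BinaryIO) -> int:
    result = shift = 0
    while True:
        b = f.read(1)
        if not b:
            raise EOFError("unexpected end of input while reading varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def encode_pair(delete: bool, key: bytes, value: bytes) -> bytes:
    """Hand-rolled protobuf encoding of StoreKVPair (fields 1, 2, 3)."""
    msg = bytearray()
    if delete:
        msg += b"\x08\x01"  # field 1, wire type 0 (varint)
    if key:
        msg += b"\x12" + encode_varint(len(key)) + key  # field 2, wire type 2
    if value:
        msg += b"\x1a" + encode_varint(len(value)) + value  # field 3, wire type 2
    return bytes(msg)


def decode_pair(msg: bytes) -> KVPair:
    f = io.BytesIO(msg)
    delete, key, value = False, b"", b""
    while f.tell() < len(msg):
        tag = read_varint(f)
        if tag == 0x08:
            delete = bool(read_varint(f))
        elif tag == 0x12:
            key = f.read(read_varint(f))
        elif tag == 0x1A:
            value = f.read(read_varint(f))
        else:
            raise ValueError(f"unexpected tag {tag:#x}")
    return delete, key, value


def write_version(f: BinaryIO, version: int, pairs: Iterable[KVPair]) -> None:
    """Write one section: version, total size of kvpairs, then the kvpairs."""
    body = b"".join(
        encode_varint(len(m)) + m for m in (encode_pair(*p) for p in pairs)
    )
    f.write(encode_varint(version) + encode_varint(len(body)) + body)


def read_versions(f: BinaryIO) -> Iterator[Tuple[int, List[KVPair]]]:
    """Yield (version, pairs) for every section until end of input."""
    while True:
        try:
            version = read_varint(f)
        except EOFError:
            return  # clean end of file (a sketch: truncation is not detected here)
        size = read_varint(f)
        body = io.BytesIO(f.read(size))
        pairs = []
        while body.tell() < size:
            pairs.append(decode_pair(body.read(read_varint(body))))
        yield version, pairs
```

Because `size` covers all kvpairs of a version, a reader that only needs a later version can `seek` past the body instead of decoding it.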

Index files:

  • global history index db, key -> bitmap of block numbers.
  • index file corresponding to a changeset file.
    • version offsets
    • for each version:
      • perfect hash table of keys
      • hash value -> offset in changeset file

Query procedure:

  • locate version number with history index.
  • locate changeset file with version number.
  • locate version section in index file.
  • hash key with the hash table.
  • locate offset in changeset file with hash value.
  • read value from changeset file.
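
The query steps above can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: the bitmap is a Python `int`, a plain `dict` replaces the perfect hash table, records use a toy "1-byte length + value" layout instead of protobuf, and "locate version number" is read as "find the latest version at or before the queried one in which the key changed". The class and method names are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChangeSetFile:
    start_version: int
    data: bytearray = field(default_factory=bytearray)
    # version -> {key: offset into data}; the dict stands in for the
    # per-version perfect hash table plus its "hash value -> offset" array.
    index: Dict[int, Dict[bytes, int]] = field(default_factory=dict)

    def put(self, version: int, key: bytes, value: bytes) -> None:
        self.index.setdefault(version, {})[key] = len(self.data)
        self.data += bytes([len(value)]) + value  # toy record, values < 256 bytes

    def read(self, offset: int) -> bytes:
        size = self.data[offset]
        return bytes(self.data[offset + 1 : offset + 1 + size])


class History:
    def __init__(self) -> None:
        self.bitmaps: Dict[bytes, int] = {}   # key -> bitmap of versions
        self.files: List[ChangeSetFile] = []  # sorted by start_version

    def record(self, file: ChangeSetFile, version: int, key: bytes, value: bytes) -> None:
        self.bitmaps[key] = self.bitmaps.get(key, 0) | (1 << version)
        file.put(version, key, value)

    def query(self, key: bytes, version: int) -> Optional[bytes]:
        # 1. History index: keep only bits for versions <= the queried one.
        mask = self.bitmaps.get(key, 0) & ((1 << (version + 1)) - 1)
        if not mask:
            return None  # key never changed at or before this version
        changed_at = mask.bit_length() - 1  # highest remaining bit
        # 2. Changeset file whose range contains that version.
        starts = [f.start_version for f in self.files]
        file = self.files[bisect_right(starts, changed_at) - 1]
        # 3-5. Version section of the index, then key -> offset.
        offset = file.index[changed_at][key]
        # 6. Read the value at that offset.
        return file.read(offset)
```

For example, with one file starting at version 1 and another at version 100, a key written at versions 3 and 120 resolves to the first file for any query between 3 and 119, and to the second file from 120 onwards.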

@yihuang yihuang marked this pull request as ready for review December 14, 2022 00:19
iavl/diff.py (outdated)
Signed-off-by: yihuang <[email protected]>
@yihuang yihuang requested a review from mmsqe December 14, 2022 00:22

last_version = None
offset = 0
output = Path(out_dir) / f"block-{start_version}"

not sure if it's a tradeoff to get rid of small files but we might overwrite output with different end_version but with same start_version

Collaborator Author

yeah, it's to avoid too many small files, more performant to process and easier to distribute.

> we might overwrite output with different end_version but with same start_version

Currently it doesn't overwrite existing files. It could still be a problem if the chunk files are not contiguous or overlap, but that's an operational issue that can be avoided, for example, by using a fixed granularity.
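
A small sketch of both ideas, assuming the `block-{start_version}` naming from the diff above; the granularity of 1000 versions and the function names are hypothetical. Aligning the start version to a fixed granularity keeps chunks from overlapping, and opening with mode `"xb"` makes Python refuse to overwrite an existing file.

```python
from pathlib import Path
from typing import BinaryIO


def chunk_path(out_dir: str, version: int, granularity: int = 1000) -> Path:
    """Path of the chunk file that holds `version` (granularity is an assumed value)."""
    start_version = version - version % granularity  # align to chunk boundary
    return Path(out_dir) / f"block-{start_version}"


def open_chunk(path: Path) -> BinaryIO:
    """Create the chunk file; raises FileExistsError instead of overwriting."""
    return open(path, "xb")
```

With this scheme every version in 1000..1999 maps to `block-1000`, so two runs with different end versions cannot produce differently sized files under the same name without the second run failing loudly.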

```
version: varint
size: varint # the total size of kv-pairs, so we can skip faster
kv-pairs: length prefixed proto msg
```

not sure if we need delimiter for this len, msg, len, msg,…

Collaborator Author

do you mean: msg, msg, ..., remove the length prefix?

Collaborator Author

@yihuang yihuang Dec 14, 2022

I think protobuf parsing won't work without an end marker, since messages are not self-delimiting, unless we encode the static fields directly.

I mean align like this delimiter

Collaborator Author

what's the difference?

@yihuang yihuang marked this pull request as draft January 9, 2023 01:44
Collaborator Author

yihuang commented Jan 9, 2023

The Go version will have more complete support for change set management: crypto-org-chain/cronos#791

@yihuang yihuang closed this Jan 15, 2023