Discussion: Versions should use hashes #54

martinheidegger · 2018-12-16T04:39:51Z

Currently in hyperdrive, hypercore, beaker browser (and probably in a few other tools), versions are specified as the length of the append-only log (a number). However, that is not a safe way to specify a version.

Problem case: a researcher wants to specify exactly which version of a DAT is used, and specifies it like dat://ab...ef+234. The researcher notices that the dataset doesn't fit the output, reverts to version 1 and creates a new DAT with exactly 234 versions to fit the output. With this, the researcher has managed to make false claims under the same version number.
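To make the problem concrete, here is a minimal Python sketch. It models an append-only log as a plain list and uses a flat BLAKE2b digest as a stand-in for hypercore's real Merkle-tree hashing. The `log_hash` helper and both example logs are hypothetical illustrations, not the hypercore API.

```python
import hashlib

def log_hash(entries: list[bytes]) -> str:
    """Hash every entry in order (a stand-in for hypercore's tree hash)."""
    h = hashlib.blake2b(digest_size=32)
    for entry in entries:
        # Length-prefix each entry so entry boundaries affect the digest.
        h.update(len(entry).to_bytes(8, "big"))
        h.update(entry)
    return h.hexdigest()

# Two hypothetical logs with 234 entries each; only the last entry differs.
original = [b"row-%d" % i for i in range(234)]
rewritten = [b"row-%d" % i for i in range(233)] + [b"adjusted result"]

same_length = len(original) == len(rewritten)             # both would be "+234"
same_content = log_hash(original) == log_hash(rewritten)  # the digests differ
print(same_length, same_content)
```

The length alone cannot tell the two logs apart, while a hash over the content can.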

How can we make sure this never happens? Each version of a hypercore has a hash, which makes one version of a hyperdrive a combination of several hypercore versions, each with its own hash.

Specifying a dat version with full hashes gets unwieldy, though:

dat://<channel:64-hex-chars>+<metadata:64-hex-chars>+<content:64-hex-chars>

... and that is for a single-writer dat. It would become even more of a hassle with a multi-writer dat (1 key for the channel + 2 hashes per writer). Note: I know that it could be okay to use only the first 8 characters as version identification, but that would probably not be good enough for a researcher.
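A small Python sketch of how such a fully hashed URL could be parsed and formatted, to show how long it gets. The URL shape is the one proposed above. `HashedVersion`, `parse_hashed_url` and `format_hashed_url` are hypothetical names, and lowercase hex is an assumption.

```python
import re
from typing import NamedTuple

HEX64 = r"[0-9a-f]{64}"
# Proposed shape: dat://<channel>+<metadata>+<content>, each 64 hex characters.
HASHED_URL = re.compile(rf"^dat://({HEX64})\+({HEX64})\+({HEX64})$")

class HashedVersion(NamedTuple):
    channel: str
    metadata_hash: str
    content_hash: str

def parse_hashed_url(url: str) -> HashedVersion:
    """Split a fully hashed single-writer URL into its three parts."""
    match = HASHED_URL.match(url)
    if match is None:
        raise ValueError(f"not a hashed dat version URL: {url!r}")
    return HashedVersion(*match.groups())

def format_hashed_url(version: HashedVersion) -> str:
    """Inverse of parse_hashed_url."""
    return "dat://" + "+".join(version)
```

With three 64-character parts, a single-writer URL is already 200 characters long, which is the hassle described above.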

Thinking about this for a little, I came up with the following solution, which might be a good idea for a new DEP:

(Single-writer for the sake of simplicity)

We could add one more hypercore to a hyperdrive, a version hypercore, that keeps an index of the versions and hashes:

message Version {
  string hash = 1; // Hash of the version (calculated by hashing all hashes in here)
  repeated string tags = 2; // Names to find this version by
  int32 metadataLength = 3; // Length of the metadata-core
  string metadataHash = 4; // Hash for the version on the metadata-core
  int32 contentLength = 5; // Length of the content-core
  string contentHash = 6; // Hash for the version of the content-core
}

This way a version checkout could download all entries of the version hypercore, build a lookup table from them and select the version based on that table.
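The lookup described here could be sketched in Python as follows. This is only an illustration under assumptions: the version hash is taken to be a BLAKE2b digest over the metadata hash followed by the content hash (the post only says "hashing all hashes"), and `VersionEntry` and `build_lookup` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class VersionEntry:
    metadata_length: int
    metadata_hash: str   # hex
    content_length: int
    content_hash: str    # hex
    tags: tuple[str, ...] = ()

    @property
    def version_hash(self) -> str:
        """Assumed derivation: hash the two core hashes in a fixed order."""
        h = hashlib.blake2b(digest_size=32)
        h.update(bytes.fromhex(self.metadata_hash))
        h.update(bytes.fromhex(self.content_hash))
        return h.hexdigest()

def build_lookup(entries):
    """Return (by_hash, by_tag) tables over all entries of the version core."""
    by_hash, by_tag = {}, {}
    for entry in entries:
        by_hash[entry.version_hash] = entry
        for tag in entry.tags:
            by_tag[tag] = entry  # a later entry re-using a tag wins
    return by_hash, by_tag

# Hypothetical entries, as if read from the version hypercore.
entries = [
    VersionEntry(1, "aa" * 32, 1, "bb" * 32, tags=("v1.0",)),
    VersionEntry(5, "cc" * 32, 3, "dd" * 32),
]
by_hash, by_tag = build_lookup(entries)
checkout = by_tag["v1.0"]  # gives the lengths and hashes needed to check out
```

Keeping tags and hashes in separate tables avoids a tag that happens to look like a hash shadowing a real version.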

My questions now are:

Is this a reasonable approach? Do you know a better way to get that done?
What could a multi-writer version look like?
Should this be turned into a DEP?


pfrazee · 2018-12-17T17:10:02Z

Good ideas. Maf and I have been discussing this. I'll let maf comment on what you're suggesting but I'll dump what I know has been done in this area:

Strongly-versioned links aka Strong Links. Links which include a content hash to verify their content. We have all the internal code needed for this IIRC, and it's been up to me to implement them. Our idea was to add another + to the version, so that it looked like this: dat://{pubkey}+{n}+{hash}/.
Version tags. String identifiers that can be used to identify specific points in the history. We haven't sorted out yet how this would be done, because we've been waiting for multiwriter to land so that we fully understand those requirements.
Multiwriter versioning. IIRC maf came up with a way to do this without it being a nightmare, but I can't recall what it was. @mafintosh do you recall what your versioning scheme is going to be? IIRC you had a versioning solution that wasn't a vector.
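As a rough illustration of the strong-link shape mentioned above (dat://{pubkey}+{n}+{hash}/), here is a minimal Python parser. It assumes a 64-character lowercase hex key and hash, and treats both the version and the hash as optional. `DatLink` and `parse_link` are hypothetical names, not part of any Dat module.

```python
import re
from typing import NamedTuple, Optional

STRONG_LINK = re.compile(
    r"^dat://(?P<pubkey>[0-9a-f]{64})"
    r"(?:\+(?P<n>\d+)(?:\+(?P<hash>[0-9a-f]{64}))?)?"  # optional +n, then optional +hash
    r"(?P<path>/.*)?$"
)

class DatLink(NamedTuple):
    pubkey: str
    version: Optional[int]        # n, the log length
    content_hash: Optional[str]   # present only on strong links
    path: str

def parse_link(url: str) -> DatLink:
    """Parse a plain, versioned, or strongly-versioned dat link."""
    m = STRONG_LINK.match(url)
    if m is None:
        raise ValueError(f"not a dat link: {url!r}")
    n = m.group("n")
    return DatLink(
        pubkey=m.group("pubkey"),
        version=int(n) if n is not None else None,
        content_hash=m.group("hash"),
        path=m.group("path") or "/",
    )
```

A client could then fetch version n as it does today and reject the checkout if the hash it computes differs from `content_hash`.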

jwerle · 2019-05-10T17:47:09Z

👋 just seeing this and the last working group notes. I put a little experiment together that tries to define a deep link based on hypercore strong links: https://github.com/jwerle/dat-deep-link
Happy to collaborate on any of this

[Edit] also happy to make the module conform to whatever ends up being the spec

RangerMauve · 2019-05-10T17:58:39Z

Ping @mafintosh :)

martinheidegger · 2019-06-05T14:44:19Z

Reference to the meeting notes: https://github.com/datprotocol/working-group/blob/master/meeting-notes/24-08May2019.md#meeting-notes

Take-aways:

URL compatible
Needs to work for a single-core

pfrazee mentioned this issue Dec 18, 2018: Upcoming Meeting Agenda - 7 November 2018 dat-ecosystem/consortium#36 (Closed)

chartgerink mentioned this issue Mar 25, 2019: Upcoming Meeting Agenda - 27 March 2019 dat-ecosystem/consortium#48 (Closed)

bnewbold mentioned this issue Apr 24, 2019: Upcoming Meeting: May 8th, 2019 dat-ecosystem/consortium#51 (Closed)

martinheidegger mentioned this issue Jun 5, 2019: Consider using ipfs or other existing content addressable stores for tarballs. entropic-dev/entropic#99 (Open)
