
Consider using ipfs or other existing content addressable stores for tarballs. #99

Open
Raynos opened this issue Jun 1, 2019 · 16 comments
Labels: enhancement (New feature or request), registry (the public API layer of the backend)

Comments

@Raynos

Raynos commented Jun 1, 2019

In the readme you mentioned that you want to use content-addressable storage.

There are existing content-addressable systems, like IPFS, that you could leverage.

I've recently spoken with IPFS engineers and they are really interested in making IPFS easy to use for package managers, so they might be open to implementing features you need.
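For anyone skimming: "content-addressable" just means blobs are keyed by a hash of their bytes rather than by a name, so identical tarballs dedupe automatically and clients can verify what they fetched. A rough TypeScript sketch, with a made-up store directory and function names (IPFS does the same thing with CIDs over a DHT):

```ts
import { createHash } from "crypto";
import { promises as fs } from "fs";
import * as path from "path";

const STORE_DIR = "/var/lib/registry/cas"; // hypothetical blob directory

// Store a tarball under the hash of its bytes; identical content always
// maps to the same address, so deduplication is automatic.
async function putTarball(tarball: Buffer): Promise<string> {
  const address = createHash("sha256").update(tarball).digest("hex");
  await fs.mkdir(STORE_DIR, { recursive: true });
  await fs.writeFile(path.join(STORE_DIR, address), tarball);
  return address; // clients fetch by this address later
}

// Fetch by address and verify the bytes still match it.
async function getTarball(address: string): Promise<Buffer> {
  const tarball = await fs.readFile(path.join(STORE_DIR, address));
  const check = createHash("sha256").update(tarball).digest("hex");
  if (check !== address) throw new Error("content does not match address");
  return tarball;
}
```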

@zkat
Contributor

zkat commented Jun 1, 2019

zkat/pacote#173 (comment)

IPFS performance is disappointing. I'm not sure it's ready for something like this, tbh.

ceejbot added the enhancement and registry labels on Jun 2, 2019
@ceejbot
Collaborator

ceejbot commented Jun 2, 2019

IPFS perf is a worry, but pluggable backends are a good overall approach. My next task for the project is to make it possible to store the content blobs in S3 and other object stores, so people who have durability requirements (and don't want to deal with backing up disks) have that option.
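Roughly, the shape a pluggable backend could take; the interface and class names here are illustrative, not code from this repo:

```ts
import { promises as fs } from "fs";
import * as path from "path";

// Hypothetical interface: the registry only ever talks to this, so swapping
// local disk for S3 (or IPFS) is a configuration change, not a rewrite.
interface BlobStore {
  put(address: string, data: Buffer): Promise<void>;
  get(address: string): Promise<Buffer>;
}

// Local-disk backend; addresses come from hashing the content, as above.
class FsBlobStore implements BlobStore {
  constructor(private root: string) {}

  async put(address: string, data: Buffer): Promise<void> {
    await fs.mkdir(this.root, { recursive: true });
    await fs.writeFile(path.join(this.root, address), data);
  }

  async get(address: string): Promise<Buffer> {
    return fs.readFile(path.join(this.root, address));
  }
}

// An S3 backend would implement the same two methods with PutObject /
// GetObject calls; content addresses make good object keys because they
// never collide and never change.
```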

@fwip

fwip commented Jun 4, 2019

Dat is also a very good content-addressable system in the JavaScript space, and I think a lot of the work they've done could be helpful for this application. (If not used directly, at least as inspiration / problem solving).

@martinheidegger

I am involved in the DAT community and saying hi! So: DAT is pretty cool for this, but I wouldn't use it ... yet ... because it doesn't cover enough features for a reasonably sized registry. That is changing, though: @andrewosh has made good progress adding a hypertrie structure (holepunchto/hyperdrive#233, available in an rc release). The new hyperdrive is tested with a lot of files and a lot of data (terabytes; a petabyte test is still running).

This makes it an interesting candidate for a decentralized data structure:

  1. Unlike git@github.com-style dependencies, which require server tooling, links could look like dat://<32-byte key> (optionally with a domain name: `dat://mydomain.com`), which means you don't have to buy a domain to join the fun!
  2. DATs are by definition single-writer, which both ensures that the owner can't silently be switched out and that no one can tamper in between (see the sketch after these lists).
  3. DAT uses a predefined networking stack, but it can easily be exchanged for a networking stack of your choice (DAT is nice but demanding that way).

But there are real challenges that still argue against using DAT:

  • What if someone loses the key needed to update a DAT, or publishes a malicious library? Having moderation roots that specify namespaces might be necessary for a reasonable user experience.
  • Multi-writer isn't here yet. This means only one person is allowed to publish a new version; not ideal for a package manager.
  • Proper DAT link versioning is not trivial.
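To illustrate the single-writer point from item 2 above: underneath a DAT is hypercore, an append-only log where every entry is signed by one keypair. A rough sketch (the storage path and appended data are placeholders):

```ts
// hypercore is the signed append-only log underneath DAT / hyperdrive.
const hypercore = require("hypercore");

const feed = hypercore("./package-feed"); // creates a keypair on first use

feed.ready(() => {
  // This public key is what a dat://<key> link points at.
  console.log("public key:", feed.key.toString("hex"));

  // Only the holder of the matching secret key can append; peers verify
  // every block against the public key, so nothing can be tampered with
  // in between.
  feed.append(Buffer.from("package tarball or metadata chunk"));
});
```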

@philippefutureboy

Would a protocol like BitTorrent be an interesting technology to support a package registry? It's already widely used for file distribution and seems to be quite performant.

Similarly, as stated in #252, if you are interested in exploring blockchain options I can link a few experts from the community here (Maidsafe, Skycoin, etc.).

Let me know what you think!

@tomByrer

tomByrer commented Jun 16, 2019

IPFS perf is a worry, but pluggable backends are a good overall approach. My next task for the project is to make it possible to store the content blobs in S3...

Perf issues were my instinct also.

I'm all for 'decentralized', but it seems that unless someone ensures there is an always-on, current, connected source, there can be no certainty of file availability, and that's unacceptable. So there has to be one source of truth somewhere. But extra ad-hoc POPs for a CDN-like network is a cool idea, no matter the protocol.

May I suggest contacting jsDelivr for help with this? They built their own routing system that spreads file requests over 4+ CDNs, with backups for the backups. They might even be able to host the files through jsDelivr; they already mirror npm and a good chunk of the JS on GitHub.

@ghost

ghost commented Jun 16, 2019

When "IPFS perf is a worry" is stated, is "perf" short for "performance"?

The protocol is decentralized, sure, but it is also distributed. Copies of the files stored in this protocol are automatically distributed, which means...

if the main host is down, then copies are still available via other peers on the network...

Ideally the file will still be available from anyone else because it is a p2p protocol as well. This essentially allows it to behave like a CDN with redundant backups all over the net.

It's also faster and more efficient because you never download from a single server that may be a considerable distance away from you. Instead, you download from the peers closest to you and grab the files from them incrementally.

Downloading from a centralized source has always been slower than downloading from a p2p source.

Sources disappearing has also always been an issue regardless of the protocol or hosting service.

I think it should be allowed if the host wants their files to disappear. There's a reason GDPR was put into effect, and it's because sometimes people want this.

@martinheidegger

I just stumbled on this, using ssb: https://github.com/noffle/ssb-npm-101

@hannahhoward

Hi! Just want to say I'm on the IPFS team, and folks over there are discussing what we can do to support you all. You may have seen @andrew linked ipfs-inactive/package-managers#64, where we are discussing Entropic.

As someone pretty close to the data transfer aspects of IPFS, I am super concerned about perf and working on it. In theory, it's like @averydark says: always having the world's fastest and most redundant CDN at your fingertips. But in practice there are a LOT of challenges, because IPFS is a truly distributed, global, content-addressed network without the magnet links and trackers you might have in BitTorrent, and that makes for some complicated problems. So we will hopefully be exactly what @averydark says eventually, but we're not quite there yet. One upside to having IPFS support is that as IPFS gets faster, you get the benefits. And it will get faster, or at least I think it will.

@RangerMauve

Heyo, I'm also coming from the Dat community. I'm currently working on our SDK / developer experience stuff. A package manager is a pretty big deal.

One of my worries about registries is that they'd have centralized update mechanisms and would be taking control of both indexing / curating packages and storage. Decentralized tech can help a lot with this: registries can focus more on the curation / search aspect, while users keep more control over the actual updates of their content and can move the storage of their packages as they see fit.

I think the current IPFS registry works as a sort of mirror of npm and uses IPNS links to people's package history. Tracking down all the pieces that need to be pinned is extra effort, in that you need to parse the package metadata to find all the IPFS links (see the sketch below).
Swarming the files inside packages globally seems like it'd be a lot of overhead for packages with many files, and IPNS updates are still pretty slow, so updating might be a bit of a hassle.
Swarming per file is going to be great for sparsely downloading files from packages, however, since it'll be easier to find peers just for the files you want.
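For illustration, that "find everything to pin" chore might look roughly like this; the packument shape and the /ipfs/ link format here are guesses, not the actual mirror's schema:

```ts
// Assumed shape: each published version carries an IPFS link in its dist info.
type Packument = { versions: Record<string, { dist?: { ipfs?: string } }> };

// Walk every version and collect the CIDs a mirror would need to pin.
function collectCids(pkg: Packument): string[] {
  const cids: string[] = [];
  for (const version of Object.values(pkg.versions)) {
    const link = version.dist?.ipfs; // e.g. "/ipfs/Qm..."
    if (link) cids.push(link.replace("/ipfs/", ""));
  }
  return cids;
}

// With a js-ipfs-style client, pinning would then be something like:
//   for (const cid of collectCids(pkg)) await ipfs.pin.add(cid);
```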

With Dat, you could have a different workflow using the upcoming mounts feature; I wrote a blog post about it earlier last year.
Basically, if you have a package, you can create a Dat archive to keep all its files in one place. Any version metadata can be placed in a file at the root, and you can either have the files directly in it or have some sort of fancy setup for linking to versions (or something simpler, like the manifest IPFS is using).

Then registries become archives with folders for package names, each of which mounts the archive for that package (see the sketch below).
A cool thing about this setup is that updates to any package, or to the registry, should propagate fairly quickly and can reasonably be processed in real time by listening to the change events. This could be used for all sorts of hooks for automated testing / changes / etc.
Keeping copies of a package online is a bit easier, too, since you can say "pin this archive" to keep the entire history updated, or "pin this package but only with the latest changes", or "pin this package at this specific version".
Similarly, mirrors of registries can be pinned sparsely or fully, or even mounted within other registries to group them together or federate them.
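To make the mounts idea concrete, a rough sketch against the hyperdrive 10 rc API mentioned earlier in the thread (the key is a placeholder, and error handling is minimal):

```ts
const hyperdrive = require("hyperdrive");

const registry = hyperdrive("./registry-storage"); // the registry's own drive

registry.ready(() => {
  // Placeholder: in reality this is the package drive's 32-byte public key.
  const packageKey: Buffer = Buffer.alloc(32);

  // Mount the package's archive under its name inside the registry drive.
  registry.mount("left-pad", packageKey, (err: Error | null) => {
    if (err) throw err;
    // Reads through the mount point resolve into the mounted drive.
    registry.readFile("left-pad/package.json", "utf8", (err: Error | null, data: string) => {
      if (!err) console.log(data);
    });
  });
});
```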

Search indexes could also be stored in Dat archives and distributed over the P2P network, so you wouldn't need a central server to serve search: the index could be built in one place and propagated over p2p networks using the internet, local wifi, or fancy mesh network setups. It'd also have the advantage of working offline out of the box. Of course, this could also be accomplished with IPFS, but again, updates would take more time to propagate, and individually swarming for each file in each archive would be a larger overhead than one swarm per archive / registry.

The mounts feature isn't out yet, so I'd wait a month or two while we integrate it into the SDK. Also, updating a package from multiple devices is still in the design phase. 😅

@RangerMauve

By the way, here's a toy project that @dpaez from @geut recently put together: a package manager built on top of Dat.

https://github.com/geut/building-up-on-dat/tree/master/packages/gpm

@tinchoz49

By the way, here's a toy project that @dpaez from @geut recently put together: a package manager built on top of Dat.

https://github.com/geut/building-up-on-dat/tree/master/packages/gpm

A short demo video from his talk at NodeConf Colombia: https://streamable.com/l52ba

@marcusnewton

Holochain is a much better approach than IPFS. You can guarantee availability of packages, you can establish shared rules about registering packages, the registry API would be automatically generated for you, and a GUI for browsing packages could be served over Holo. The entire thing can be fully distributed with no central servers, without sacrificing performance. Uptime is effectively 100% because you retrieve packages via a DHT, peer to peer. It's the obvious solution IMO.

@pegaltier

In addition to @marcusnewton's comment, I would like to add another positive thing about Holochain: it is built in Rust (the future). That's the reason they are late, but in the end it will be a very positive argument for the project.

@spiralcrew-ou

This is a git-like protocol created with Holochain. It could be a source of inspiration: https://github.com/uprtcl/hc-uprtcl

@yanmaani

yanmaani commented Feb 8, 2022

Consider using BitTorrent, which is not a project created as a marketing scheme for cryptocurrency.
