Consider using ipfs or other existing content addressable stores for tarballs. #99
Comments
IPFS performance is disappointing. I'm not sure it's ready for something like this, tbh.
IPFS perf is a worry, but as an overall approach I think pluggable backends is good. My next task for the project is to make it possible to store the content blobs in S3 & other object stores, so people who have durability requirements (and don't want to deal with backing up disks) can have this option.
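A minimal sketch of what such a pluggable blob store could look like, assuming blobs are keyed by content hash; the interface name and filesystem layout here are illustrative, not Entropic's actual code:

```ts
// Illustrative pluggable blob-store interface: blobs are keyed by content
// hash, and each backend only has to implement put/get.
import { promises as fs } from "fs";
import * as path from "path";

interface BlobStore {
  put(hash: string, data: Buffer): Promise<void>;
  get(hash: string): Promise<Buffer>;
}

class FsBlobStore implements BlobStore {
  constructor(private root: string) {}
  async put(hash: string, data: Buffer): Promise<void> {
    await fs.mkdir(this.root, { recursive: true });
    await fs.writeFile(path.join(this.root, hash), data);
  }
  async get(hash: string): Promise<Buffer> {
    return fs.readFile(path.join(this.root, hash));
  }
}

// An S3-backed store (or IPFS, Dat, etc.) would implement the same two
// methods, so durability becomes a deployment choice, not a code change.
```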
Dat is also a very good content-addressable system in the JavaScript space, and I think a lot of the work they've done could be helpful for this application. (If not used directly, at least as inspiration / problem solving.)
I am involved in the DAT community and saying hi! So: DAT is pretty cool for this, but I wouldn't use it ... yet ... because it doesn't cover enough features for a reasonably sized registry. Though that is changing: @andrewosh has progressed far in adding a hypertrie structure holepunchto/hyperdrive#233 (available in an rc-release). The new hyperdrive is tested with a lot of files and a lot of data (terabytes; a petabyte test is still running). This makes it an interesting candidate for a decentralized data structure, but there are still good challenges ahead and good reasons not to use DAT just yet.
Would a protocol like BitTorrent be an interesting technology to support a package registry? It's already widely used for file distribution and seems to be quite performant. Similarly, as stated in #252, if you are interested in exploring blockchain options, I can link a few experts from the community here (Maidsafe, Skycoin, etc.). Let me know what you think!
Perf issues were my instinct also. I'm all for 'decentralized', but it seems that if someone doesn't ensure they're the 'always on + current + connected' source, there can be no certainty of file availability, and that can't be allowed to happen. So there has to be one source of truth somewhere. But extra ad-hoc POPs for a CDN-like network is a cool idea, no matter the protocol. May I suggest contacting jsDelivr for help with this? They built their own routing system that spreads file requests over 4+ CDNs, with backups for the backups. They might even be able to host the files through jsDelivr; they already mirror npm and a chunk of the JS on GitHub.
When 'perf' is stated, is that short for performance? The protocol is decentralized, sure, but it is also distributed. Copies of the files stored in this protocol are automatically distributed, which means that if the main host is down, copies are still available via other peers on the network. Ideally the file will still be available from anyone else because it is a p2p protocol as well. This essentially allows it to behave like a CDN with redundant backups all over the net. It's also faster and more efficient because you never download from a single server that may be a considerable distance away from you. Instead, you download from the peers closest to you and grab the files incrementally from those peers. Downloading from a centralized source has always been slower than downloading from a p2p source. Sources disappearing has also always been an issue regardless of the protocol or hosting service. I think it should be allowed if the host wants their files to disappear. There's a reason GDPR was put into effect, and it's because sometimes people want this.
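For the concrete mechanics, here's a minimal sketch of fetching a tarball by its content address from a local IPFS daemon, assuming a recent `ipfs-http-client`; the CID and filenames are placeholders:

```ts
// Fetch a tarball by CID; which peers actually serve the blocks is
// decided by the swarm, not by us.
import { create } from "ipfs-http-client";
import { createWriteStream } from "fs";

async function fetchTarball(cid: string, dest: string): Promise<void> {
  const ipfs = create({ url: "http://127.0.0.1:5001" }); // local daemon
  const out = createWriteStream(dest);
  // Blocks arrive incrementally, potentially from many peers at once.
  for await (const chunk of ipfs.cat(cid)) {
    out.write(chunk);
  }
  out.end();
}

fetchTarball("QmPlaceholderCid...", "package-1.0.0.tgz").catch(console.error);
```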
I just stumbled on this, using ssb: https://github.com/noffle/ssb-npm-101 |
Hi! Just want to say I'm on the IPFS team and folks over there are discussing what we can do to support you all. You may have seen @andrew linked ipfs-inactive/package-managers#64 -- where we are discussing Entropic. As someone pretty close to the data transfer aspects of IPFS, I am super concerned about perf and working on it. In theory, it's like @averydark says -- always having the world's fastest and most redundant CDN at your fingertips -- but in practice, there are a LOT of challenges because IPFS is a truly distributed, global, content-addressed network, without the magnet links and trackers you might have in BitTorrent, and this makes for some complicated problems. So, we will hopefully be exactly what @averydark says eventually, but we're not quite there yet. One upside to having IPFS support is that as IPFS gets faster, you get the benefits, and it will get faster, or at least I think it will anyway.
Heyo, I'm also coming from the Dat community. I'm currently working on our SDK / developer experience stuff. A package manager is a pretty big deal.

One of my worries about registries is that they'd have centralized update mechanisms and would be taking control of both indexing / curating packages and storage. Decentralized tech can help a lot with this. Registries can focus more on the curation / search aspect, and users can have more control over the actual updates of their content and can move the storage of their packages as they see fit.

I think the current IPFS registry works as a sort of mirror of NPM / uses IPNS links to people's package history. Tracking down all the pieces that need to be pinned is extra effort, in that you need to parse the package metadata to find all the IPFS links that need to be pinned. With Dat, you could have a different workflow using the upcoming mounts feature (see the sketch below). Registries become archives which can have folders for package names, each of which mounts the archive for that package. Search indexes could also be stored in dat archives and distributed over the P2P network, so you wouldn't need a central server to serve the search; it could be processed in one place and propagated over p2p networks using the internet, local wifi, or fancy mesh network setups. It'd also have the advantage of working offline out of the box.

Of course this stuff could also be accomplished with IPFS, but again the updates would take more time to propagate, and individually swarming for each file in each archive would be a larger overhead than one swarm per archive / registry. The mounts feature isn't out yet, so I'd wait a month or two while we integrate it into the SDK. Also, updating a package from multiple devices is still in the design phase. 😅
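A hypothetical sketch of the mounts-based registry layout described above. Since the mounts feature wasn't released at the time of writing, the exact API (`drive.mount`) is an assumption based on hyperdrive 10 previews:

```ts
// Hypothetical: a registry is itself an archive that mounts each
// package's author-controlled archive under a folder per package name.
import hyperdrive from "hyperdrive"; // no bundled types; treat as `any`

const registry = hyperdrive("./registry-storage"); // the registry's own archive

registry.ready(() => {
  // Placeholder key of a package's archive, maintained by its author.
  const packageKey = Buffer.from("<64-char-hex-key-of-package-archive>", "hex");
  registry.mount("/left-pad", packageKey, (err: Error | null) => {
    if (err) throw err;
    // Readers who sync the registry archive now resolve /left-pad/... reads
    // against the author's archive, fetched over the p2p swarm.
  });
});
```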
By the way, here's a toy project that @dpaez from @geut recently put together: a package manager built on top of Dat. https://github.com/geut/building-up-on-dat/tree/master/packages/gpm
A short demo video from his talk at NodeConf Colombia: https://streamable.com/l52ba
Holochain is a much better approach than IPFS. You can guarantee availability of packages, you can establish shared rules about registering packages, the registry API would be automatically generated for you and a GUI for browsing packages can be served over Holo. The entire thing can be fully distributed with no central servers, without sacrificing performance. Uptime is 100% because you retrieve packages via a DHT, peer to peer. It's the obvious solution IMO |
In addition to @marcusnewton's comment, I would like to add another positive thing about Holochain: it's actually built in Rust (the future), which is why they are late, but in the end that will be a very positive argument for the project.
This is a git-like protocol created with Holochain. It could be a source of inspiration: https://github.com/uprtcl/hc-uprtcl |
Consider using BitTorrent, which is not a project created as a marketing scheme for cryptocurrency. |
In the readme you mentioned that you want to use content-addressable storage.
There are existing content-addressable systems like IPFS that you can leverage.
I've recently spoken with IPFS engineers, and they are really interested in making IPFS easy to use for package managers, so they might be open to implementing features you need.
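For reference, the core idea of content addressing is just that a blob's hash *is* its key. A minimal sketch, assuming tarballs are keyed by a SHA-512 digest (as npm's cacache does); the function and path layout are illustrative, not Entropic's actual scheme:

```ts
// Store a tarball under its own SHA-512 digest, subresource-integrity style.
import { createHash } from "crypto";
import { promises as fs } from "fs";
import * as path from "path";

async function storeTarball(store: string, tarball: Buffer): Promise<string> {
  const digest = createHash("sha512").update(tarball).digest("hex");
  // The content's hash is its address, so identical tarballs dedupe for
  // free and any backend (disk, S3, IPFS) can verify what it serves.
  const dest = path.join(store, digest.slice(0, 2), digest);
  await fs.mkdir(path.dirname(dest), { recursive: true });
  await fs.writeFile(dest, tarball);
  return `sha512-${digest}`;
}
```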