August 14th, 2019 via IRC (#datprotocol)
- Attendees:
  - rangermauve
  - mafintosh
  - pfrazee
  - jhand
  - noffle
  - andrewosh
  - cblgh
- Agenda:
  - Catch up with what everyone has been doing
  - Review last meeting/action items
  - Discuss exposing more detailed "replication progress" information
  - Talk about datprotocol.com and Dat website updates
  - Talk about multiwriter. Karissa and some people from cobox put together https://github.com/coboxcoop/peerfs, which seems to be working.
- pfrazee
- Been working on the dat2 integration with Beaker. That's going well; I'm waiting on a few more upstream features before I can merge it into blue
- Also been working on blue's sidebar tools - the editor, webterm, things like that - and just trying to wrap up the overall experience for blue
- rangermauve
- I've made progress on figuring out how to use Corestore. Met with Andrew and telemohn
- Currently working on hyperswarm-proxy which is like discovery-swarm-stream but for hyperswarm
- Hopefully I'll have the new networking and hyperdrive in the SDK this month
- noffle
- I've been mostly working on Mapeo, specifically on "sync progress". Since some of our partners have very large datasets (150k+ records) and often slow computers and/or slow phones, sync can take 10+ minutes. Having sync progress reported is very helpful for seeing how long it will take and whether it's gotten stuck
- also been thinking about some higher level interfaces on top of kappa-core
- andrewosh
- making a series of updates to the daemon to support pfrazee. adding diff/history/download and a handful of bug fixes and improvements
- finished the work on the multithread fuse module -- integrating that into the daemon, since the API has changed slightly, is still todo
- been designing a hypertrie change that will simplify the implementation of mounts, symlinks, and will allow for atomic renames. if it works, it will also let us get rid of the mountable-hypertrie module
- item 3 (the hypertrie change) is still as early as it gets. still coming up with the right abstractions. importantly, it won't require any data structure change
- jhand
- Been pretty focused on CS&S work lately and also took a nice week to visit some big trees. https://oregonwild.org/forests/oregons-ancient-forests-hiking-guide
- Working with Gina a bit to go through existing dat content and create a new sitemap, plan for finishing up MOSS
- benchmarking lots of hypercores
- rangermauve: Did you end up having time to do the benchmarking?
- noffle: no, I think maf and I talked and we ended up with a TODO on his list? need to check my chat history...
- rangermauve: Fair enough! We'll figure it out later.
- cblgh: https://gist.github.com/cblgh/10fd8677f43ec34c0bf964851031c933 the history you were looking for
- noffle: yeah, maf and I had a PM convo after that
- noffle: update on the benchmarking: I PM'd with mafintosh and gave him numbers, but it didn't go anywhere re: actionables
- rangermauve: Fair enough. We can figure it out after he gets back from vacation, then.
- ACK in hypercore protocol
- rangermauve: I think maf and substack figured out the ACK stuff and it's been landed in hypercore, which is cool
- extensions
- pfrazee: on the extensions thing, dunno if you heard, but the new replication protocol will allow extensions to be set dynamically
- rangermauve: Oh! Mafintosh mentioned that last time, but I wasn't sure what the timeline on that was. Is there a PR for it?
- pfrazee: not yet, it's part of the ongoing update to the caps system
- rangermauve: Is there a branch for the caps system I could follow?
- andrewosh: the protocol will support it once the cap system is there, and that's getting very close (lemme find the branch)... actually, it may not be pushed yet. will need to ping mafintosh
- rangermauve: That's awesome! Dynamic extensions will help a ton, I think
- andrewosh: ah and rangermauve's extension APIs were merged into 10 as well!
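For context, a rough sketch of what dynamically registered extensions could look like once this lands. The `registerExtension` name and its options here follow the shape of the extension-message proposal and are assumptions, not a released API:

```js
// Sketch only: `registerExtension` and its options are assumed from the
// extension-message proposal; the final hypercore API may differ.
const hypercore = require('hypercore')

const feed = hypercore('./data')

feed.ready(() => {
  // The "dynamic" part: registering an extension after the feed (and any
  // replication streams) already exist, instead of declaring a fixed list
  // up front when the protocol stream is created.
  const ext = feed.registerExtension('example/ping', {
    encoding: 'json',
    onmessage (message, peer) {
      console.log('got extension message:', message)
    }
  })

  // Send a message to every currently connected peer.
  ext.broadcast({ type: 'ping', time: Date.now() })
})
```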
- More detailed replication progress information:
- noffle: like I was talking about above, having information about progress of a sync is very useful (non-live sync, in our case). we want to track uploads and downloads but on a per-session basis. hypercore doesn't really expose that information right now
- rangermauve: What are you currently using for progress tracking? I think I saw Karissa post the "hyperhealth" module before https://github.com/okdistribute/hyperhealth/
- noffle: like, knowing at the start of sync (ok, I have N blocks to up, and M blocks to down). ad-hoc'd something using hypercore's "download" event. we aren't tracking uploads yet. hyperhealth looks like something different?
- andrewosh: by per-session, do you just mean "what's been uploaded/downloaded for a given peer, for the current process?"
- noffle: no, rather: what is to be uploaded and downloaded. so, two peers connect and they figure out what blocks need to go in each direction. exposing THAT would be nice. and then tracking the progress of that. that's basically what I'm slowly trying to build, but it'd be great if hypercore could offer that; it seems like a generally useful tool
- rangermauve: So, I think some of this data should be available in the Peer object inside hypercore, but I think it's not very documented and I'm not even sure if the data is there. 😅 https://github.com/mafintosh/hypercore/blob/master/lib/replicate.js#L65
- jhand: ya, we've gone down this hole a few times and I'm not sure it has ever been solved, or is possible. see joehand/dat-push#9 https://github.com/dotloom/darp2/blob/3ececfbf76aa3f5f5c913659d69aa48460fd71f6/util/trackUpload.js#L13-L66
- noffle: thanks I'll check those out
- rangermauve: The comments indicate that 'hypercore might not announce that the uploaded data has been received'. I think the new ACK feature changes that
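As a point of reference, here is a minimal sketch of the kind of ad-hoc tracking noffle describes, using only what hypercore exposes today (the `download`/`upload` events and `feed.downloaded()`). The up-front "N blocks to up, M blocks to down" totals are exactly the part it cannot provide:

```js
// Minimal sketch of ad-hoc progress tracking with hypercore's existing
// 'download'/'upload' events. It reports activity as it happens, but not
// the negotiated "N up / M down" totals discussed above.
const hypercore = require('hypercore')

function trackProgress (feed) {
  let downloaded = feed.downloaded() // blocks already held locally
  let uploaded = 0                   // blocks sent during this session

  feed.on('download', (index, data) => {
    downloaded++
    console.log(`down: ${downloaded}/${feed.length} blocks`)
  })

  feed.on('upload', (index, data) => {
    uploaded++
    console.log(`up: ${uploaded} blocks this session`)
  })
}

// e.g. trackProgress(hypercore('./data'))
```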
- jhand: I will make an issue in the forum since it comes up a lot in many places =)
- rangermauve: Actually, noffle, would you mind opening an issue in Hypercore, or pinging me if one exists? It'd be nice to figure out a dream API for how this stuff should be reported and then figure out how to implement it
- noffle: sync progress? can do
- rangermauve: I'd really like to get this for the SDK too, and I think I'll have bandwidth to contribute in that direction
- mafintosh: could you open an issue explaining the use case for why you need something like that exposed? Probably makes it a bit easier to reason about
- noffle: yes! for mapeo we have desktop<->mobile devices syncing (non-live) map data with each other. often it's a LOT of data and it can take many minutes (10+) to sync it all, as they are often old phones and laptops. progress info makes it clearer to users how long the sync will take, as well as indicating that the sync isn't stuck/hanging. I'll open an issue too though!
- mafintosh: ah makes sense. was missing the upload context
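Purely as a strawman for that issue: nothing below exists in hypercore. The `sync-plan`/`sync-progress` events and their fields are invented here only to illustrate what a "plan first, then progress" API could report:

```js
// Hypothetical "dream API" strawman -- none of these events or fields exist
// in hypercore today; they only illustrate the shape being discussed.
// Given an existing hypercore instance `feed`:
const stream = feed.replicate({ live: false })

stream.on('sync-plan', (plan) => {
  // Totals negotiated with the remote peer before any blocks move.
  console.log(`will download ${plan.toDownload}, upload ${plan.toUpload} blocks`)
})

stream.on('sync-progress', (p) => {
  console.log(`down ${p.downloaded}/${p.toDownload}, up ${p.uploaded}/${p.toUpload}`)
})
```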
- website updates
- jhand: Ya, so Gina has been doing a content audit of all the existing Dat websites and content, with the aim of improving discoverability, making maintenance and updating easier, and centralizing things =)
- rangermauve: That's awesome! I honestly didn't know some of the things she found existed. :P
- jhand: This map shows a proposal for a new sitemap https://user-images.githubusercontent.com/684965/63033982-6105e600-be6d-11e9-958e-2ba85e8e438f.png One of the big questions in my mind is the datprotocol.com site and related content (DEPs, how dat works, dat protocol book)
- rangermauve: What sort of questions? Like, where it should go?
- pfrazee: I feel like keeping datprotocol.com with its current content and role might be smart. but the other ones (how dat works, dat protocol book) could easily move
- jhand: Yes, should it continue being a separate domain/content? Should the subcontent (how dat works, dat protocol book) be moved? Should branding be consistent across both or remain separate?
- rangermauve: Personally, I'd like to see that stuff in the docs repo. I'm not totally sure if DEPs should be there, but I think it'd help discoverability a lot. I have no idea how I found DEPs in the first place, and it felt like I stumbled across a forbidden source of knowledge
- jhand: Anyways, we don't need to solve this now. Gina will open it up for discussion more broadly this week. But I just wanted to get WG eyes on it and any initial reactions
- rangermauve: What's a good reason to not merge them into docs? The web parts, I mean
- jhand: What are the web parts? ;)
- rangermauve: Like, I don't think DEPs should be stored in the docs repo as the canonical source. I like having them in a separate repo. But I'd like to see them added as folders in the docs repo somehow. Similarly with "how dat works"
- jhand: The benefit of having datprotocol.com and related content is they are all somewhat self-contained and (somewhat) independent of implementation. So it may be nice to keep having that as a source for learning about the Dat protocol.
- rangermauve: Cool. Yeah, I think it'd be nice to have a call with Gina with whoever is interested in that stuff
- jhand: I will share a link here when it's up; anyone who has an opinion, please voice it then =)
- peerfs
- rangermauve: Writing to an archive from multiple sources is probably the biggest pain point in Dat, and I've spoken to dozens of people who don't want to use it because of that. cobox is actively working on a multiwriter dat-type thing, and I was wondering what the WG thinks about collaborating with them to bring it in as a "standard" multiwriter in the ecosystem. pfrazee and mafintosh in particular.
- pfrazee: you gotta talk to mafintosh about this. Last I talked to him, he was saying there are some complexities to layering a CRDT onto hyperdrive that make sparse replication problematic. I haven't had a chance to find out the details on that though
- rangermauve: So, what about having an initial version that's imperfect that "works" and then releasing a breaking change for a higher performance and reliability version?
- andrewosh: do these approaches to multiwriter allow for sparse metadata syncing? wondering what the scaling properties are
- rangermauve: The way peerfs works right now is it uses kappa-core to have a list of hyperdrives that gets shared between peers, and it uses `stat` to get the "latest" version. https://github.com/coboxcoop/peerfs/blob/master/index.js#L85
- andrewosh: got it, so only a subset of the complete metadata feeds are used to build the materialized view?
- rangermauve: It'd essentially be whatever metadata they load from hypertrie when they do a `readdir` and `stat`
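To make the `stat` approach concrete, here is a minimal sketch of picking the "latest" copy of a path across a set of hyperdrives. The mtime-comparison heuristic is an assumption about how such a resolver could work, not a verbatim excerpt from peerfs:

```js
// Minimal sketch: resolve the "latest" version of a path across multiple
// hyperdrives by comparing stat mtimes. The mtime heuristic is assumed
// here, not copied from peerfs itself.
const { promisify } = require('util')

async function latestStat (drives, path) {
  let latest = null
  for (const drive of drives) {
    const stat = await promisify(drive.stat.bind(drive))(path)
      .catch(() => null) // this writer may never have created the file
    if (!stat) continue
    if (!latest || stat.mtime > latest.stat.mtime) latest = { drive, stat }
  }
  return latest // { drive, stat } for the newest copy, or null if absent
}
```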
- rangermauve: So, the thing is that they're going to be coming out with this and people are going to be using it for multi-writer archives. The question is whether we'll be able to load these archives in Beaker and the CLI. Anyway, it's not something we need to figure out today, but it'd be good to have that in the back of our heads for next time
Action items:
- noffle will open an issue about a download progress API in hypercore
- mafintosh will catch up and follow up about peerfs
- everyone will think about the new website redesign