Conversation
db1735d to
3c2bcb9
Compare
|
Trying this out, it works against a saved capture but soon after starting live capture I'm getting this panic: |
f65fc8f to
1444ebb
Compare
|
OK, this actually required a bunch more work. Previously, I was just letting the However, when we fetch an item we also fetch its child count, and that was screwing things up. So I've made |
|
I'm getting a new error now 🙂 |
|
OK, I can reproduce that. Not sure what's going on yet. It goes away if I revert the last commit on this branch which adds that extra verification, but that then leads to different crashes later, so clearly something's still not using the right numbers. |
317f04c to
cdad50f
Compare
|
I've fixed several routes for state from the live capture to slip into the snapshot via reuse of things from |
dfc6949 to
52d1dff
Compare
|
I've made some progress on debugging this, and I have a workaround, though not yet a proper fix. The crash was happening when When we first call However, when we update a description because something about the item has changed, we weren't doing that. So the workaround is just: if item_updated {
// The node's description may change.
- let summary = cap.description(&item_node.item, false)?;
+ let summary = match cap.description(&item_node.item, false) {
+ Ok(description) => description,
+ Err(e) => format!("Error: {e}"),
+ }With that error handling, everything works fine from the user's perspective, and I've not seen any errors actually become visible in the UI. However, adding some further code to log the errors, along with the previous description of the items in question, reveals what's failing: The failed lookup is into the |
Previously, when the requested range was present in the index, the target_length parameter was irrelevant and not needed. However, we now want to be able to pass in the length of a snapshot, which may have a shorter length than the full target stream.
|
I found the bug! The problem was with the way I was implementing the I was forwarding this to the However, the The fix is in commit a0712bf. I've kept the change to error handling on item updates, but removed the subsequent debug code. This should now be ready to go. |
Rather than trying to work with live data from the database during UI updates, take a snapshot of the database state and then work with that snapshot until the next update. This eliminates a lot of potential for non-determinism that's otherwise hard to test or reproduce.
The challenge is to make the snapshotting cheap; this can be achieved by grouping all the
AtomicU64counters used by the database into a contiguous block of memory which serves as the state representation, and which can be snapshotted effectively with what is essentially just amemcpyby the writing thread, in between handling packets.There's a lot of API churn in this PR, for two reasons:
Counterrequires aCounterSet, theDumpAPI is changed such thatrestorerequires adb: &mut CounterSetargument.CaptureReaderandCaptureSnapshotReadertypes that both implement aCaptureReaderOpstrait. However, this means every single call site has to change from fields to methods, e.g.cap.packet_index.len()->cap.packet_index().len().Fixes some crashes in #150, and some much more occasional crashes in mainline Packetry.