Open
Description
When an fsync is slow and the node is terminated before it completes, the uv worker thread running the fsync may access already-deleted objects once the fsync finishes.
2025-06-19T16:14:49.026392Z 100 [info ] src/snapshots/snapshot_manager.h:198 | New snapshot file written to snapshot_46_47 [189779 bytes] (unsynced)
2025-06-19T16:14:49.028663Z 100 [info ] src/snapshots/snapshot_manager.h:111 | Start fsync
2025-06-19T16:14:49.028693Z -0.016 0 [trace] /ccf/src/node/history.h:487 | mt_flush_to index=48
2025-06-19T16:14:49.028770Z -0.016 0 [trace] /ccf/src/node/history.h:91 | History [3] <sha256 100d1c49088435484be4b4202a124d8c41f1d8d4f8a36078e88eea41f71f9e79>
2025-06-19T16:14:49.028838Z 100 [debug] /ccf/src/host/ledger.h:1489 | Ledger commit: 48/48
2025-06-19T16:14:49.028961Z 100 [debug] /ccf/src/host/ledger.h:698 | Committed ledger file ledger_47-48.committed
2025-06-19T16:14:49.029085Z -0.016 0 [debug] /ccf/src/consensus/aft/raft.h:2373 | Commit on n[689f10e149371dc149addf8fc097736b8de4fe307623d3e77e15b79ddf1fcb09]: 48
2025-06-19T16:14:49.029156Z -0.016 0 [debug] /ccf/src/enclave/rpc_sessions.h:540 | Closing a session inside the enclave: 27
2025-06-19T16:14:49.029222Z 100 [debug] /ccf/src/host/rpc_connections.h:417 | rpc closed from enclave 27
2025-06-19T16:14:49.029356Z -0.016 0 [info ] /ccf/src/enclave/enclave.h:453 | Enclave stopped successfully. Stopping host...
2025-06-19T16:14:49.029433Z 100 [info ] cf/src/host/handle_ring_buffer.h:100 | Host stopped successfully
2025-06-19T16:14:49.029502Z 100 [info ] /ccf/src/host/main.cpp:992 | Exited event loop
2025-06-19T16:14:49.030333Z 100 [info ] /ccf/src/host/main.cpp:1011 | Ran an extra 1000 cleanup iteration(s)
2025-06-19T16:14:49.030596Z 100 [info ] /ccf/src/host/main.cpp:1017 | Failed to close uv loop, walking now
2025-06-19T16:14:49.030657Z 100 [fail ] /ccf/src/host/main.cpp:1019 | Failed to close uv loop cleanly: EBUSY
The right solution here is probably for the fsync'ing worker thread to notify the main thread when it has completed, so that we can shut down cleanly.
This will probably look like setting some global flag on the main thread, checking it in the completion callbacks on worker threads, and ensuring they do not enqueue more work if it is set.
Then the main thread can simply run until completion.
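A rough sketch of that idea, using only the C++ standard library rather than libuv or any CCF code. `ShutdownCoordinator`, `DemoResult` and `run_demo` are hypothetical names made up for illustration, and the "fsync" is simulated by blocking on a future, so this only shows the shape of the handshake under those assumptions:

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical helper (not part of CCF or libuv): tracks in-flight worker
// tasks and carries the global "stopping" flag described above.
class ShutdownCoordinator
{
  std::atomic<bool> stopping{false};
  std::mutex m;
  std::condition_variable cv;
  std::size_t in_flight = 0;

public:
  // Called by the main thread before handing a task to a worker.
  void begin_work()
  {
    std::lock_guard<std::mutex> guard(m);
    ++in_flight;
  }

  // Called by the worker once its task (and completion callback) is done.
  void end_work()
  {
    {
      std::lock_guard<std::mutex> guard(m);
      --in_flight;
    }
    cv.notify_all();
  }

  void request_stop() { stopping.store(true); }
  bool is_stopping() const { return stopping.load(); }

  // Main thread blocks here until every in-flight task has finished.
  void wait_for_workers()
  {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return in_flight == 0; });
  }
};

struct DemoResult
{
  bool fsync_completed;
  bool follow_up_enqueued;
};

// Simulates a termination request arriving while a slow fsync is in flight.
DemoResult run_demo()
{
  ShutdownCoordinator coord;
  std::promise<void> fsync_gate; // released to let the fake fsync "finish"
  std::future<void> fsync_done = fsync_gate.get_future();
  std::atomic<bool> fsync_completed{false};
  std::atomic<bool> follow_up_enqueued{false};

  coord.begin_work();
  std::thread worker([&] {
    fsync_done.wait(); // stand-in for the slow fsync call
    fsync_completed = true;
    // Completion callback: only enqueue more work if we are not stopping.
    if (!coord.is_stopping())
    {
      follow_up_enqueued = true;
    }
    coord.end_work(); // notify the main thread that this task is finished
  });

  coord.request_stop(); // shutdown is requested before the fsync returns
  fsync_gate.set_value(); // the fsync now completes
  coord.wait_for_workers(); // main thread waits instead of tearing down
  worker.join();

  return {fsync_completed.load(), follow_up_enqueued.load()};
}
```

In this sketch the main thread only proceeds to teardown after `wait_for_workers()` returns, so the worker never touches objects that have already been destroyed, and the flag check keeps the completion path from scheduling anything new once shutdown has begun.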