Open
Description
Following up discussion in #6968, I noticed a surprising duplication of ledger files: joining nodes create another copy of the ledger in their `.current` directories, even though they already have it in their `.committed` directories!
Investigating that eventually led to #3064, which suggests this is deliberate and expected behaviour:
> The consequence of this is that, when a node is started without a snapshot but with access to another ledger (e.g. via `--read-only-ledger-dir`), both directories will contain some identical ledger files (since the node will have to replay all historical transactions).
I think this is confusing, on top of being obviously inefficient and expensive.
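A rough way to measure the duplication is to hash the files in both directories and list the matches. The sketch below is not CCF tooling: the function names are hypothetical, and it assumes duplicated files are byte-identical, which may not hold for files at different stages (e.g. committed vs. uncommitted) even when they contain the same entries.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large ledger files are not loaded whole.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicated_ledger_files(main_dir: Path, read_only_dir: Path) -> list[tuple[str, str]]:
    """List (main_name, read_only_name) pairs whose contents are byte-identical.

    Both arguments are hypothetical: any two directories of ledger files.
    """
    by_digest: dict[str, str] = {}
    for p in sorted(read_only_dir.iterdir()):
        if p.is_file():
            by_digest[file_digest(p)] = p.name

    pairs = []
    for p in sorted(main_dir.iterdir()):
        if p.is_file():
            match = by_digest.get(file_digest(p))
            if match is not None:
                pairs.append((p.name, match))
    return pairs
```

Calling `duplicated_ledger_files(Path(main), Path(read_only))` on a joiner's two ledger directories after it has caught up should show how many files have been copied rather than reused.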
I'd like to explore a few things around this:
- Confirm whether we should be calling `init_as_backup()` (and its resulting rollback) when joining without a snapshot. There are still several comments saying these paths should occur after a snapshot is applied.
- When initialising views and rolling back to 0 on this joiner, we should truncate all files from the `.current` directory. We're somehow in an odd situation where we claim to have done a rollback, but we haven't actually truncated the ledger files, leading to confusion when the replay reaches this point (see the original bug and fix).
- Try to make `ledger_append` use the existing (read-only) file where possible. Somewhere we decide we need a writeable file, so create a full copy, but deep inside `write_entry()` we already check if this is an entry we already hold, in which case this is an idempotent nop. We should strive for the latter behaviour.
- Can we replay the state from these ledger directories, as part of the join process? We'd replay if we had a snapshot, and we have to confirm through the consensus protocol where we agree with the leader, but currently we always report that we're at 0 and receive the full history over consensus. We should be able to do some partial recovery-like replay process to reach a more helpful initial state.
- What happens when the read-only directory contains conflicting files, or a dead suffix? We have suggested operator processes to avoid this, but it's possible storage is temporarily unavailable (so recovery completes on a shorter ledger), and then returns (so a joiner has a longer ledger from the origin service, now read-only).
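
To make the behaviours in the list above concrete, here is a toy in-memory model in Python. It is not CCF code: `ToyLedger` and `agreed_prefix` are hypothetical names, entries are plain byte strings, and "files" are reduced to a list. It only illustrates three ideas: a rollback that really truncates, an append that is an idempotent no-op for entries already held, and finding the point of agreement when the local ledger has a dead suffix.

```python
class ToyLedger:
    """In-memory stand-in for a node's ledger (illustrative only)."""

    def __init__(self, existing=()):
        # entries[i] holds the entry with seqno i + 1, e.g. as loaded
        # from an existing (read-only) ledger directory.
        self.entries = list(existing)

    def write_entry(self, seqno: int, data: bytes) -> bool:
        """Append an entry; return False if it was an idempotent no-op."""
        if seqno < 1:
            raise ValueError("seqno starts at 1")
        if seqno <= len(self.entries):
            if self.entries[seqno - 1] != data:
                # Same seqno, different contents: a conflicting history.
                raise ValueError(f"conflicting entry at seqno {seqno}")
            return False  # already held, nothing to write or copy
        if seqno != len(self.entries) + 1:
            raise ValueError(f"gap before seqno {seqno}")
        self.entries.append(data)
        return True

    def truncate(self, seqno: int) -> None:
        """Roll back so seqno is the last entry kept (0 drops everything)."""
        if seqno < 0:
            raise ValueError("seqno must be non-negative")
        del self.entries[seqno:]


def agreed_prefix(local, leader) -> int:
    """Return the last seqno at which local and leader entries still match."""
    n = 0
    for a, b in zip(local, leader):
        if a != b:
            break
        n += 1
    return n


if __name__ == "__main__":
    # The joiner holds a longer ledger from the origin service; the leader
    # recovered on a shorter ledger and then diverged at seqno 4.
    local = [b"a", b"b", b"c", b"d-old"]
    leader = [b"a", b"b", b"c", b"d-new", b"e"]

    ledger = ToyLedger(local)
    ledger.truncate(agreed_prefix(local, leader))  # drop the dead suffix
    written = [ledger.write_entry(i + 1, e) for i, e in enumerate(leader)]
    print(written)
```

In this model only the entries after the agreed prefix are actually written. Truncating to 0 and replaying the full history would end in the same state, but with every entry rewritten.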