Differentiate between Consensus and Cluster Headers storage #8222
base: master
Conversation
Weakens the chainID requirement for cluster chains when reading from storage.
module/builder/collection/builder.go
Outdated
```diff
 for _, blockID := range clusterBlockIDs {
-	header, err := b.clusterHeaders.ByBlockID(blockID)
+	header, err := b.clusterHeaders.ByBlockID(blockID) // TODO(4204) transaction deduplication crosses clusterHeaders epoch boundary
```
Transaction de-duplication actually does not occur across cluster and epoch boundaries.
- Each transaction is uniquely assigned to one cluster in one epoch, based on the transaction's reference block (see ingestion logic)
- Therefore, each cluster has a range of reference block heights it can accept. These ranges are equivalent to the height range of blocks within an epoch: $[FirstBlockInEpoch.Height, LastBlockInEpoch.Height]$. These ranges are consecutive and do not overlap.
- In short, if we are considering a cluster block with reference block height $FirstBlockInEpoch.Height$, then `minRefHeight` is actually $FirstBlockInEpoch.Height$ (we don't need to search further back).
- We already take this into account when determining the lowest possible reference block.
So I think we can remove this TODO, and remove the special-case logic in storage.Headers meant to work around this. I would also suggest adding some documentation here explaining why there is no overlap between clusters and epochs.
Thank you for clarifying this explicitly. Now that I understand better what's going on, I can say that while you are right that that's what should be happening, it is not, due to two bugs 🐞🐞. The existing behaviour did actually look across the epoch boundary (now causing cluster_switchover_test to fail), because:
- 🐞 `ctx.refEpochFirstHeight` is never initialized in the block builder, so it defaults to 0. (Introduced in "Enforce that collection reference blocks are bound to the cluster's operating epoch" #4148 and not caught in "Removes unused first height field from LN builder" #6828.) Essentially, the `minRefHeight` did not get clamped to the start of the epoch, and was always `LastFinalizedBlockInEpoch.Height - DefaultTransactionExpiry`.
- 🐞 In addition, the minimum height of the range is decreased twice: once in `lowestPossibleReferenceBlockHeight()` (which would respect the epoch boundary, if not for bug 1) and then again in `findRefHeightSearchRangeForConflictingClusterBlocks` (which does not respect the epoch boundary).

I believe this means that this particular lookup was always checking approximately the past ~1200 reference heights (`flow.DefaultTransactionExpiry * 2`), regardless of which epoch those heights were in.
While the first bug also affects payload construction, I don't think it has impacted correctness (essentially transactions from the previous epoch would not immediately be considered expired, but they are already split into pools by epoch anyways as you have noted, so transactions from a different epoch will not be encountered.)
You're right. Oops, that was introduced by me 🫣.
Here's what I think we should do:
- Bring back the logic that populates `refEpochFirstHeight`.
- Check for `ErrNotFound`: if we see that error, then set `refEpochFirstHeight` to zero. This means that we have just joined the network, and that our local state cutoff (our root block) is newer than `refEpochFirstHeight`.
done in e397124 - does the comment on expected behaviour from setting refEpochFirstHeight = 0 match with what you expect?
To double-check: the minimum reference height to check for a duplicated transaction is approximately finalizedHeight - 2*DefaultTransactionExpiry, because of the following situation:
```
Collection                    C1          C2
                    ┏━━━━━━━━━━━┫
Transaction         T1          T2
                    ┊           ┊
Reference Block     R1──────────R2──────────Head
                     <-Expiry->  <-Expiry->
```
The new collection C2 could barely include transaction T2; to ensure we deduplicate correctly, we need to check collection C1, which has reference block R1 (because it could just barely include transaction T1).
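Combining this with the epoch-boundary discussion above, the lower bound of the dedup search could be computed as in the following sketch. The function name and shape are illustrative (not flow-go's actual API); the expiry value matches `flow.DefaultTransactionExpiry`.

```go
package main

import "fmt"

// DefaultTransactionExpiry mirrors flow.DefaultTransactionExpiry (600 blocks).
const DefaultTransactionExpiry = 600

// lowestRefHeightToCheck sketches the dedup lower bound described above:
// roughly finalizedHeight - 2*Expiry, clamped both to zero (no underflow)
// and to the first height of the operating epoch, so the search never
// crosses the epoch boundary.
func lowestRefHeightToCheck(finalizedHeight, epochFirstHeight uint64) uint64 {
	var lowest uint64
	if finalizedHeight > 2*DefaultTransactionExpiry {
		lowest = finalizedHeight - 2*DefaultTransactionExpiry
	}
	if lowest < epochFirstHeight {
		lowest = epochFirstHeight // clamp to the epoch's first block
	}
	return lowest
}

func main() {
	fmt.Println(lowestRefHeightToCheck(10000, 9500)) // clamped to the epoch start
	fmt.Println(lowestRefHeightToCheck(10000, 8000)) // full 2*Expiry window applies
}
```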
@tim-barry I would go off this function and this documentation.
When we are inserting C2, the range we need to check is [C2MinRefHeight-Expiry, C2MaxRefHeight]
C2MinRefHeight is both the reference block height of C2, and the smallest reference block height of all transactions in C2. C2MaxRefHeight is the largest reference block height of all transactions in C2.
The check was unintentionally crossing an epoch boundary and retrieving headers from a previous cluster chain. In addition, `ctx.refEpochFirstHeight` was never initialized. See #8222 (comment) for details
storage/store/headers.go
Outdated
```go
if header.ChainID != chainID {
	return fmt.Errorf("expected chain ID %v, got %v: %w", chainID, header.ChainID, storage.ErrWrongChain)
}
if chainID.IsClusterChain() {
```
Why do we need this check? From what I see, operation.InsertHeader already ensures that we hold a lock.
Since we now know which kind of chain we are attempting to insert a Header for, we can be more granular about the specific lock required, and ensure we don't accidentally insert a main chain Header while only holding storage.LockInsertOrFinalizeClusterBlock, or vice versa.
Following up on this discussion:
we can be more granular about the specific lock required
I agree. Though, I think this finer granularity should live in the lowest level possible. Thereby, we guarantee that all code paths must go through the fine-grained check.
I would suggest to differentiate on the level of the storage methods: i.e. split InsertHeader into two more specialized methods.
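The suggested split could look roughly like the following sketch. Lock handling is modeled here with plain strings for illustration only; the real code uses `lockctx.Proof` and the `storage.Lock…` constants, and the function signatures differ.

```go
package main

import "fmt"

// Lock names modeled as plain strings for illustration; flow-go uses
// lockctx.Proof and dedicated lock constants instead.
const (
	LockInsertBlock                  = "lock_insert_block"
	LockInsertOrFinalizeClusterBlock = "lock_insert_or_finalize_cluster_block"
)

// insertMainHeader and insertClusterHeader sketch the suggested split of
// InsertHeader into two specialized methods: each one checks for exactly
// the lock that guards its chain, so a caller can never insert a
// main-chain header while only holding the cluster lock, or vice versa.
func insertMainHeader(heldLock string) error {
	if heldLock != LockInsertBlock {
		return fmt.Errorf("missing required lock %q (held %q)", LockInsertBlock, heldLock)
	}
	return nil // proceed with the write batch
}

func insertClusterHeader(heldLock string) error {
	if heldLock != LockInsertOrFinalizeClusterBlock {
		return fmt.Errorf("missing required lock %q (held %q)", LockInsertOrFinalizeClusterBlock, heldLock)
	}
	return nil // proceed with the write batch
}

func main() {
	// Holding the cluster lock while inserting a main-chain header fails.
	fmt.Println(insertMainHeader(LockInsertOrFinalizeClusterBlock) != nil)
}
```

Because every code path must go through one of the two specialized functions, the fine-grained check lives at the lowest level, as the comment above recommends.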
…D/storage init preInitFns can contain DynamicStartPreInit, which affects how the root snapshot is loaded. Since ChainID is read from the root snapshot to initialize storage if the node is not bootstrapped, it should come after preInit functions.
AlexHentschel
left a comment
First batch of comments. Still reviewing ...
model/flow/chain.go
Outdated
```go
// IsClusterChain returns whether the chain ID is for a collection cluster during an epoch, rather than a full network.
func (c ChainID) IsClusterChain() bool {
	return strings.HasPrefix(string(c), "cluster")
}
```
I would highly discourage using the string directly, hardcoded in the implementation. Up to this point, a cluster's chain ID could be entirely arbitrary. Hence, using just "some string" was fine. That it started with `cluster` was solely for human readability. This is changing now, so we want to enforce that the same uniform convention is used everywhere in the code base.
You are establishing an archetype of an implementation pattern here, that other people might replicate. Please design the code such that
- It is easy to verify that the code uses the same implementation everywhere. Your code should be very expressive that this is a binding convention, which must be followed everywhere.
- Updating the convention is easy and has low probability of introducing bugs (this should automatically follow from a good implementation satisfying 1. So in a way, you can use 2. as a self-check for your own implementation whether your code satisfies 1. well).
With the current pattern, I see the risk that the string `cluster` appears in multiple different locations in the code. Hence, when changing the convention, we risk that one of the implementations is forgotten, creating a bug.
Suggestions:
- Your implementation should convey (in code and documentation): this is a binding naming convention, that must be followed consistently!
- Introducing a constant (e.g. `ClusterIDPrefix`) helps; see the `Mainnet` constant for an example. Diligently documenting the naming convention as part of the constant's godoc would help a lot in my opinion. Then for every place using that constant, it is clear that they should be applying the same convention.
- We want to keep the code that checks whether something is a cluster chain and the code that generates the cluster chain ID close to each other. This helps engineers reviewing or working with the code tremendously to see the connection.
  - The functions `IsClusterChain` and `CanonicalClusterID` should live directly next to each other.
  - This is cluster specific, so my preference would be putting the constant `ClusterIDPrefix` and the function `IsClusterChain` into the package `state/cluster`, right next to `CanonicalClusterID` and `CanonicalRootBlock`, which must also follow this convention. Unfortunately, I suspect this would create a circular dependency. If that is the case, I would suggest putting the constant `ClusterIDPrefix`, the generating function `CanonicalClusterID`, and the checking function `IsClusterChain` all into the file `model/flow/cluster.go`.
- Please create a unit test that verifies the convention:
  - `CanonicalClusterID` generates chain IDs that start with the constant `ClusterIDPrefix`
  - `IsClusterChain` accepts those chain IDs generated by `CanonicalClusterID`
  - `IsClusterChain` rejects all identifiers that we use for the main chains of different networks (see `AllChainIDs`)
- I would suggest making the check inside `IsClusterChain` as tight as possible. The convention currently is (`state/cluster/root_block.go`, line 13 in 502f81c):

  ```go
  return flow.ChainID(fmt.Sprintf("cluster-%d-%s", epoch, participants.ID()))
  ```

  You could specifically check with a regular expression for precisely the expected structure (integer for the epoch followed by a hex string).
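Putting these suggestions together, a sketch of the generator, the constant, and a regex-tightened check might look like this. It is simplified relative to flow-go: the real generator takes a `flow.IdentityList` and returns `flow.ChainID`, and the exact hex-length of `participants.ID()` is not pinned down here.

```go
package main

import (
	"fmt"
	"regexp"
)

// ClusterIDPrefix documents the binding naming convention for cluster chain
// IDs: "cluster-<epoch>-<hex identifier>", as produced by CanonicalClusterID.
// Every place generating or checking cluster chain IDs must use this constant.
const ClusterIDPrefix = "cluster"

// clusterChainIDRegex matches precisely the canonical structure: the prefix,
// a decimal epoch counter, and a hex-encoded participants identifier.
var clusterChainIDRegex = regexp.MustCompile(`^` + ClusterIDPrefix + `-\d+-[0-9a-f]+$`)

// CanonicalClusterID sketches the existing generator from state/cluster/root_block.go,
// with the participants identifier simplified to a hex string parameter.
func CanonicalClusterID(epoch uint64, participantsID string) string {
	return fmt.Sprintf("%s-%d-%s", ClusterIDPrefix, epoch, participantsID)
}

// IsClusterChain reports whether chainID follows the canonical cluster convention.
func IsClusterChain(chainID string) bool {
	return clusterChainIDRegex.MatchString(chainID)
}

func main() {
	id := CanonicalClusterID(5, "deadbeef")
	fmt.Println(id, IsClusterChain(id), IsClusterChain("flow-mainnet"))
}
```

Because both functions share `ClusterIDPrefix` and sit side by side, updating the convention touches exactly one file, which is the maintainability property the review asks for.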
storage/store/headers.go
Outdated
```go
if chainID.IsClusterChain() {
	panic("NewHeaders called on cluster chain ID - use NewClusterHeaders instead")
}
```
Using panics in production code is almost universally discouraged. We tolerate (still discourage) it only for struct-internal sanity checks that the struct itself should guarantee never fail. It should be documented why this panic should never happen and details on how the struct guarantees this.
For this constructor, I would like to request that an error (you can decide whether you want to throw an `irrecoverable` exception, a generic error, or a typed error) is returned instead of a panic, because we are dealing with an external input.
The reason is that errors quite nicely preserve the call stack of where the error happened. In contrast, the panic just terminates the program without much information which call stack led to the panic ... which makes debugging very cumbersome.
Suggestion:
- I feel this PR has already quite a large change surface. You could include a TODO, indicating that this panic will be changed to an error return later.
- Please also emphasize this in your PR description, specifically referencing the affected relevant code locations. This manages the expectation of reviewers of this PR. AND it helps reviewers of your subsequent PR to verify that really all panics have been replaced by error returns.
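The requested error-returning constructor could be sketched as follows. This is a trimmed-down illustration: the real `Headers` struct, its dependencies (`module.CacheMetrics`, `storage.DB`), and the final `IsClusterChain` implementation are omitted, and the stand-in check is the simple prefix test from the diff above.

```go
package main

import (
	"fmt"
	"strings"
)

// isClusterChain is a stand-in for the real convention check discussed above.
func isClusterChain(chainID string) bool {
	return strings.HasPrefix(chainID, "cluster")
}

// Headers is a trimmed-down stand-in for the storage struct.
type Headers struct {
	chainID string
}

// NewHeaders sketches the requested pattern: return an error for invalid
// external input instead of panicking, so the caller's error wrapping
// preserves the call stack that led to the failure.
func NewHeaders(chainID string) (*Headers, error) {
	if isClusterChain(chainID) {
		return nil, fmt.Errorf("NewHeaders called with cluster chain ID %q - use NewClusterHeaders instead", chainID)
	}
	return &Headers{chainID: chainID}, nil
}

func main() {
	_, err := NewHeaders("cluster-1-deadbeef")
	fmt.Println(err != nil) // the constructor rejects cluster chain IDs
}
```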
storage/store/headers.go
Outdated
```go
if !chainID.IsClusterChain() {
	panic("NewClusterHeaders called on non-cluster chain ID - use NewHeaders instead")
}
```
Please see my prior comment and suggestion to replace the panic by an error return
```go
	err := operation.RetrieveHeader(r, blockID, &header)
	return &header, err

// It supports storing, caching and retrieving by block ID, and additionally indexes by header height and view.
func NewHeaders(collector module.CacheMetrics, db storage.DB, chainID flow.ChainID) *Headers {
```
Please document the requirements on the chainID input - I would reference the constant ClusterIDPrefix for further reading.
```go
// It supports storing, caching and retrieving by block ID, and additionally an index by header height.
func NewClusterHeaders(collector module.CacheMetrics, db storage.DB, chainID flow.ChainID) *Headers {
```
Please document the requirements on the chainID input - I would reference the constant ClusterIDPrefix for further reading.
```go
// ByBlockID returns the header with the given ID. It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header with the given ID exists
//   - [storage.ErrWrongChain] if the block header exists in the database but is part of a different chain than expected
```
This needs to be documented as part of the Headers interface. Please make sure you always keep the interface and implementation documentation consistent.
I very much like the error return for the method:
flow-go/storage/store/headers.go, lines 182 to 188 in e397124:

```go
// ByBlockID returns the header with the given ID. It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header with the given ID exists
//   - [storage.ErrWrongChain] if the block header exists in the database but is part of a different chain than expected
func (h *Headers) ByBlockID(blockID flow.Identifier) (*flow.Header, error) {
	return h.retrieveTx(blockID)
}
```
However, there are other methods that are conceptually very similar, whose implementations are still completely oblivious about the separation of cluster and main blocks:
Lines 29 to 31 in c1c435a:

```go
// Exists returns true if a header with the given ID has been stored.
// No errors are expected during normal operation.
Exists(blockID flow.Identifier) (bool, error)
```

Lines 38 to 45 in c1c435a:

```go
// ByParentID finds all children for the given parent block. The returned headers
// might be unfinalized; if there is more than one, at least one of them has to
// be unfinalized.
// CAUTION: this method is not backed by a cache and therefore comparatively slow!
//
// Expected error returns during normal operations:
//   - [storage.ErrNotFound] if no block with the given parentID is known
ByParentID(parentID flow.Identifier) ([]*flow.Header, error)
```

Lines 47 to 51 in c1c435a:

```go
// ProposalByBlockID returns the header with the given ID, along with the corresponding proposer signature.
// It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header or proposer signature with the given blockID exists
ProposalByBlockID(blockID flow.Identifier) (*flow.ProposalHeader, error)
```
In all cases, asking the cluster-bound Headers for a consensus block should return ErrWrongChain and vice versa.
Please keep the documentation of the interface and implementation consistent and include tests confirming that the documented error type is returned by the implementation as expected.
Related request
Currently, the interface documentation is too concise in my opinion, as it does not provide a broader picture
Lines 7 to 8 in c1c435a:

```go
// Headers represents persistent storage for blocks.
type Headers interface {
```
Please extend the interface documentation:
- Point out that, in general, multiple instances might exist in parallel inside a single node. Explain it on the example of the collectors storing headers of their own cluster consensus as well as the main consensus. Emphasize that care should be taken to interact with the correct instance, because otherwise the implementation will return [storage.ErrWrongChain].
- Emphasize that implementations for the cluster consensus do not yet support lookups by view:
  - add a dedicated sentinel (e.g. `ErrNotAvailableForClusterConsensus`)
  - document that this sentinel might be returned by the `ByView` method
  - check the error type in a test
Please add test cases confirming that the expected sentinel error types are returned for all relevant methods. Rule of thumb:
- Adding a new sentinel error as a possible return of some method? Add a test case confirming that exactly that type is produced by the implementation.
- Adding a new exception / generic error as a possible return of some method? Add a test case confirming that the function errors for the expected conditions, and that the implementation does not misrepresent the exception as a documented sentinel indicating a benign error case.
This distinction allows more granularity with which locks are required. Also similarly split up definition of the storeWithLock functor used by Header storage.
Generation and checking for cluster ChainIDs are now next to each other, currently in the `cluster/state` package (instead of the `model/flow` package where the ChainID type and standard chainIDs are defined). Switched to testing the full chainID with a regex instead of just the prefix.
Next batch of comments. Still reviewing.
Concerns about function GetChainIDFromLatestFinalizedHeader and GetLatestFinalizedHeader returning storage.ErrNotFound errors
Please see this and that comment for details. Instead of addressing the challenge on the level of GetChainIDFromLatestFinalizedHeader and GetLatestFinalizedHeader, I would recommend going to the lowest sensible level, i.e. functions RetrieveFinalizedHeight and RetrieveSealedHeight.
Working on this code is also a great opportunity to add the missing documentation to the functions RetrieveFinalizedHeight and RetrieveSealedHeight 😉
Specifically, I would recommend to add the following dedicated error to storage/operation/heights.go
```go
var (
	// IncompleteStateError indicates that some information cannot be retrieved from the database,
	// which the protocol mandates to be present. This can be a symptom of a corrupted state
	// or an incorrectly / incompletely bootstrapped node. In most cases, this is an exception.
	//
	// ATTENTION: in most cases, [IncompleteStateError] is a symptom of a corrupted state
	// or an incorrectly / incompletely bootstrapped node. Typically, this is an unexpected exception
	// and should not be checked for the same way as benign sentinel errors.
	IncompleteStateError = errors.New("data required by protocol is missing in database")
)
```

Then, utilize the knowledge that the requested values should always be present for a properly bootstrapped node in the implementation of `RetrieveFinalizedHeight` and `RetrieveSealedHeight`:
```go
// RetrieveFinalizedHeight reads the height of the latest finalized block directly from the database.
//
// During bootstrapping, the latest finalized block and its height are indexed and thereafter the
// latest finalized height is only updated (but never removed). Hence, for a properly bootstrapped
// node, this function should _always_ return a proper value.
//
// CAUTION: This function should only be called on properly bootstrapped nodes. If the state is
// corrupted or the node is not properly bootstrapped, this function may return [IncompleteStateError].
// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
//
// No error returns are expected during normal operations.
func RetrieveFinalizedHeight(r storage.Reader, height *uint64) error {
	var h uint64
	err := RetrieveByKey(r, MakePrefix(codeFinalizedHeight), &h)
	if err != nil {
		// mask the lower-level error to prevent confusion with the often benign `storage.ErrNotFound`:
		return fmt.Errorf("latest finalized height could not be read, which should never happen for bootstrapped nodes: %w", IncompleteStateError)
	}
	*height = h
	return nil
}

// RetrieveSealedHeight reads the height of the latest sealed block directly from the database.
//
// During bootstrapping, the latest sealed block and its height are indexed and thereafter the
// latest sealed height is only updated (but never removed). Hence, for a properly bootstrapped
// node, this function should _always_ return a proper value.
//
// CAUTION: This function should only be called on properly bootstrapped nodes. If the state is
// corrupted or the node is not properly bootstrapped, this function may return [IncompleteStateError].
// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
func RetrieveSealedHeight(r storage.Reader, height *uint64) error {
	var h uint64
	err := RetrieveByKey(r, MakePrefix(codeSealedHeight), &h)
	if err != nil {
		// mask the lower-level error to prevent confusion with the often benign `storage.ErrNotFound`:
		return fmt.Errorf("latest sealed height could not be read, which should never happen for bootstrapped nodes: %w", IncompleteStateError)
	}
	*height = h
	return nil
}
```

Please add tests confirming the correct error returns: IncompleteStateError, not ErrNotFound 🙏
```go
// GetChainIDFromLatestFinalizedHeader attempts to retrieve the consensus chainID
// from the latest finalized header in the database, before storage or protocol state have been initialized.
// Expected errors during normal operations:
//   - [storage.ErrNotFound] if the node is not bootstrapped.
func GetChainIDFromLatestFinalizedHeader(db storage.DB) (flow.ChainID, error) {
	h, err := GetLatestFinalizedHeader(db)
	if err != nil {
		return "", err
```
Concerns about the returned error type
I have mixed feelings about this. Normally, the approach you took here is exactly the pattern I would encourage engineers to follow. Typically, from the perspective of the low-level logic, requested data being absent is not necessarily a conclusive sign of state corruption. So we typically just escalate the error, document it properly and let the caller decide.
But but but 😅 ... why would it be legitimate to read data from an uninitialized node? This should never happen, unless the higher-level logic has a bug. In addition, we offer the method IsBootstrapped, by which the caller can check whether the node is bootstrapped. To be clear: I am essentially arguing that we should preempt usage patterns (something I typically discourage). The reason is that storage.ErrNotFound is often a benign error, but in this case it is most likely not. If it is mistaken for being benign, by just throwing it up the call stack and letting the top-level method decide (a method which might also have code paths that are expected to throw a benign ErrNotFound), then we are in trouble.
Therefore, I would suggest we throw anything but a ErrNotFound error here and explicitly document why.
Function naming
I would suggest renaming this to `GetChainID`. How precisely we retrieve it is largely an implementation detail, which is already covered in the documentation. This detail is not important enough in my opinion to emphasize as part of the function name.
potential documentation ambiguity
before storage or protocol state have been initialized.
I feel this wording is a bit ambiguous. The storage and protocol state have to be initialized and persisted to the database. This is essentially the bootstrapping. I think what you are trying to say is that this function reads directly from the database, without instantiating high-level storage abstractions or protocol state structs (?)
Suggestion
```diff
-// GetChainIDFromLatestFinalizedHeader attempts to retrieve the consensus chainID
-// from the latest finalized header in the database, before storage or protocol state have been initialized.
-// Expected errors during normal operations:
-//   - [storage.ErrNotFound] if the node is not bootstrapped.
-func GetChainIDFromLatestFinalizedHeader(db storage.DB) (flow.ChainID, error) {
-	h, err := GetLatestFinalizedHeader(db)
-	if err != nil {
-		return "", err
+// GetChainID retrieves the consensus chainID from the latest finalized block in the database. This
+// function reads directly from the database, without instantiating high-level storage abstractions
+// or the protocol state struct.
+//
+// During bootstrapping, the latest finalized block and its height are indexed and thereafter the
+// latest finalized height is only updated (but never removed). Hence, for a properly bootstrapped node,
+// this function should _always_ return a proper value (constant throughout the lifetime of the node).
+//
+// Note: This function should only be called on properly bootstrapped nodes. If the state is corrupted
+// or the node is not properly bootstrapped, this function may return [IncompleteStateError].
+// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
+// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
+//
+// No error returns are expected during normal operations.
+func GetChainID(db storage.DB) (flow.ChainID, error) {
+	h, err := GetLatestFinalizedHeader(db) // returns [operation.IncompleteStateError] if the required data is not found, but never storage.ErrNotFound.
+	if err != nil {
+		return "", fmt.Errorf("failed to read latest finalized block, which should never happen for bootstrapped nodes; call IsBootstrapped upfront if in doubt: %w", err)
```
Yes, you are correct about the intention (reading directly from the database before instantiating the higher-level interfaces).
Good to know that you consider this situation an exception to the default in terms of errors; I agree that we should be using IsBootstrapped anyways.
```go
// GetLatestFinalizedHeader attempts to retrieve the latest finalized header
// without going through the storage.Headers interface.
// Expected errors during normal operations:
//   - [storage.ErrNotFound] if the node is not bootstrapped.
func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
	var finalized uint64
	r := db.Reader()
	err := operation.RetrieveFinalizedHeight(r, &finalized)
	if err != nil {
		return nil, err
```
Same here: I would suggest returning anything but a `storage.ErrNotFound` error. As this deviates from a common convention, we should call it out and explain the reasoning:
```diff
-// GetLatestFinalizedHeader attempts to retrieve the latest finalized header
-// without going through the storage.Headers interface.
-// Expected errors during normal operations:
-//   - [storage.ErrNotFound] if the node is not bootstrapped.
-func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
-	var finalized uint64
-	r := db.Reader()
-	err := operation.RetrieveFinalizedHeight(r, &finalized)
-	if err != nil {
-		return nil, err
+// GetLatestFinalizedHeader retrieves the header of the latest finalized block. This function reads directly
+// from the database, without instantiating high-level storage abstractions or the protocol state struct.
+//
+// During bootstrapping, the latest finalized block and its height are indexed and thereafter the latest
+// finalized height is only updated (but never removed). Hence, for a properly bootstrapped node, this
+// function should _always_ return a proper value.
+//
+// Note: This function should only be called on properly bootstrapped nodes. If the state is corrupted
+// or the node is not properly bootstrapped, this function may return [IncompleteStateError].
+// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
+// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
+//
+// No error returns are expected during normal operations.
+func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
+	var finalized uint64
+	r := db.Reader()
+	err := operation.RetrieveFinalizedHeight(r, &finalized) // returns [operation.IncompleteStateError] if the required data is not found, but never storage.ErrNotFound.
+	if err != nil {
+		return nil, fmt.Errorf("failed to read latest finalized height, which should never happen for bootstrapped nodes; call IsBootstrapped upfront if in doubt: %w", err)
```
```go
var _ storage.Headers = (*Headers)(nil)

// NewHeaders creates a Headers instance, which stores block headers.
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // NewHeaders creates a Headers instance, which stores block headers. | |
| // NewHeaders creates a Headers instance, which manages block headers of the main consensus (not cluster consensus). |
```go
// This error allows the caller to detect duplicate inserts. If the header is stored along with other parts
// of the block in the same batch, similar duplication checks can be skipped for storing other parts of the block.
// No other error returns are expected during normal operation.
func InsertClusterHeader(lctx lockctx.Proof, rw storage.ReaderBatchWriter, headerID flow.Identifier, header *flow.Header) error {
```

Please add a test analogous to `TestHeaderInsertCheckRetrieve` for the cluster operations.

In addition, tests verifying that `InsertHeader` and `InsertClusterHeader` fail if the wrong lock (or no lock) is acquired would be great; those tests are missing entirely at the moment for both methods.
```go
	ctx.refChainFinalizedHeight = mainChainFinalizedHeader.Height
	ctx.refChainFinalizedID = mainChainFinalizedHeader.ID()

	// If we don't have the epoch boundaries (first/final height ON MAIN CHAIN) cached, try retrieve and cache them
```

I am not sure if the comment is misleading or I am misunderstanding it 😅. I think this block of code does not do anything with the "final height", right?

```suggestion
// We can't specify the height of the epoch's first consensus block (height ON MAIN CHAIN) during which this cluster is
// active, because the builder is typically _instantiated_ before the epoch starts. However, the builder should only be
// called once the epoch has started, i.e. consensus has finalized the first block in the epoch. Consequently, we
// retrieve the epoch's first height on the first call of the builder, and cache it for future calls.
```
```go
	// can be missing if we joined (dynamic bootstrapped) in the middle of an epoch.
	// 0 means FinalizedAncestryLookup will not be bounded by the epoch start,
	// but only by which cluster blocks we have available.
	refEpochFirstHeight = 0
```

⚠️ I am not sure this is correct.

Assume the following situation:

- The collector joined late (dynamic bootstrapped).
- Some access node sends it a transaction that was already included before the collector joined (the AN might be byzantine, or the AN might be behind).

The collector can't just scan its locally known history and include the transaction if it doesn't appear in any blocks it knows. It has to guarantee that the transaction does not appear in the fork, no matter how little of the fork's history the collector node knows. Otherwise, it cannot propose (or risks being slashed).

Hence, I would conclude that just scanning the "cluster blocks we have available" is insufficient. Let's make sure we think this through properly and document the reasoning why the algorithm also works for short histories.

I'll try to look at this again tomorrow with a fresh brain. It is very well possible that the argument can be deduced from the code (we should document it nonetheless). If we can't work this out, let's ask Jordan for advice.

I think you are right that there is a potential issue here.
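The deduplication window under discussion rests on transaction expiry: a transaction is only includable while the chain has not advanced too far past its reference block. A simplified sketch of that rule (the 600-block constant mirrors `flow.DefaultTransactionExpiry`; the exact validity semantics live in flow-go, so treat this as an approximation):

```go
package main

import "fmt"

// transactionExpiry mirrors flow.DefaultTransactionExpiry (600 blocks).
const transactionExpiry = 600

// isExpired reports whether a transaction referencing block height refHeight
// can no longer be included once the chain has finalized finalHeight.
func isExpired(refHeight, finalHeight uint64) bool {
	return finalHeight > refHeight+transactionExpiry
}

func main() {
	fmt.Println(isExpired(1000, 1600)) // false: exactly at the boundary
	fmt.Println(isExpired(1000, 1601)) // true: past the expiry window
}
```

The open question above is whether this bounded window alone justifies scanning only locally available cluster blocks when the local history is shorter than the window.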
Co-authored-by: Alexander Hentschel <[email protected]>
Force-pushed from b5ccec2 to d2f2846
```go
	start = minRefHeight - flow.DefaultTransactionExpiry + 1
	if start > minRefHeight {
		start = 0 // overflow check
	}
```

On `func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64)`:

Overall, I think it would help to document the computation we perform here in a lot of detail, and tie it to the logic calling this function. Suggestion:

```suggestion
// findRefHeightSearchRangeForConflictingClusterBlocks computes the range of reference block heights of ancestor blocks
// which could possibly contain transactions duplicating those in our collection under construction, based on the range
// of reference heights of transactions in the collection under construction.
// The input range is the (inclusive) range of reference heights of transactions eligible for inclusion in the collection
// under construction. The output range is the (inclusive) range of reference heights which needs to be searched in order
// to avoid transaction repeats.
//
// Within a single epoch, we have argued that for a set of transactions, with `minRefHeight` (`maxRefHeight`) being
// the smallest (largest) reference block height, we only need to inspect collections with reference block heights
// c ∈ (minRefHeight-E, maxRefHeight]. Note that the lower bound is exclusive, while the upper bound is inclusive,
// which we transform to an inclusive range:
//
//	 c ∈ (minRefHeight-E, maxRefHeight]
//	⇔ c ∈ [minRefHeight-E+1, maxRefHeight]
//
// In order to take epoch boundaries into account, we note: A collector cluster is only responsible for transactions whose
// reference blocks are within the cluster's operating epoch. Thus, we can bound the lower end of the search range by the
// height of the first block in the epoch. Formally, we only need to inspect collections with reference block height
//
//	c ∈ [max{minRefHeight-E+1, epochFirstHeight}, maxRefHeight]
func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64) {
	// in order to avoid underflow, we rewrite the lower-bound equation entirely without subtraction:
	//	 max{minRefHeight-E+1, epochFirstHeight} == epochFirstHeight
	//	⇔ minRefHeight - E + 1 ≤ epochFirstHeight
	//	⇔ minRefHeight - E < epochFirstHeight
	//	⇔ minRefHeight < epochFirstHeight + E
	if minRefHeight < ctx.refEpochFirstHeight+flow.DefaultTransactionExpiry {
		return ctx.refEpochFirstHeight, maxRefHeight
	}
	// We reach the following line only if minRefHeight-E+1 > epochFirstHeight ≥ 0. Hence, an underflow is impossible.
	return minRefHeight + 1 - flow.DefaultTransactionExpiry, maxRefHeight
}
```
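The suggested computation can be exercised standalone. The sketch below mirrors the derivation with `blockBuildContext` reduced to the single field the formula needs, and a stand-in constant for `flow.DefaultTransactionExpiry`:

```go
package main

import "fmt"

// expiry is a stand-in for flow.DefaultTransactionExpiry (E = 600).
const expiry = 600

// searchRange returns the inclusive range [start, end] of reference heights to
// inspect: [max{minRefHeight-expiry+1, epochFirstHeight}, maxRefHeight].
func searchRange(minRefHeight, maxRefHeight, epochFirstHeight uint64) (start, end uint64) {
	// rewritten without subtraction to avoid uint64 underflow
	if minRefHeight < epochFirstHeight+expiry {
		return epochFirstHeight, maxRefHeight
	}
	return minRefHeight + 1 - expiry, maxRefHeight
}

func main() {
	s, e := searchRange(1500, 1510, 100) // deep inside the epoch
	fmt.Println(s, e)                    // 901 1510
	s, e = searchRange(300, 310, 100) // near the epoch start: clamped to 100
	fmt.Println(s, e)                 // 100 310
}
```

With `minRefHeight = 1500` the lower bound is `1500 - 600 + 1 = 901`; with `minRefHeight = 300` the formula would dip below the epoch start, so the range is clamped at 100.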
Walkthrough

This PR implements chain ID awareness throughout the Flow storage layer, enabling the differentiation between consensus and cluster headers. The changes bind a chain ID to each Headers instance at construction, introduce cluster-specific header operations, add chain validation during header retrieval, and thread the chain ID parameter through storage initialization across the codebase.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Init as Initialization
    participant ChainID as Chain ID Resolver
    participant Storage as Storage Layer
    participant Headers as Headers Store
    Init->>ChainID: determineChainID() or<br/>GetChainIDFromLatestFinalizedHeader()
    ChainID->>Storage: Query latest finalized header
    Storage-->>ChainID: Return header with ChainID
    ChainID-->>Init: Resolved chainID
    Init->>Storage: InitAll(metrics, db, chainID)
    Storage->>Headers: NewHeaders(collector, db, chainID) or<br/>NewClusterHeaders(collector, db, chainID)
    Headers->>Headers: Store chainID internally
    Headers-->>Storage: Headers instance bound to chainID
    Storage-->>Init: All storage initialized
    Note over Headers: Later retrieval operations<br/>validate header.ChainID<br/>== configured chainID
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring special attention:
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1

♻️ Duplicate comments (2)

storage/operation/headers.go (1)

57-73: LGTM: Cluster header insertion mirrors the consensus header pattern.

`InsertClusterHeader` correctly uses the cluster-specific lock (`LockInsertOrFinalizeClusterBlock`) while maintaining the same existence-check-before-upsert pattern as `InsertHeader`. Both functions share the same key namespace (`codeHeader`), which is intentional since headers are globally unique by ID regardless of whether they come from consensus or cluster chains.

Reminder: A past review comment requested adding tests analogous to `TestHeaderInsertCheckRetrieve` for cluster operations, as well as tests verifying lock acquisition. Please ensure these are addressed.

module/builder/collection/builder.go (1)
645-651: Potential off-by-two error in search range calculation.

A past review identified a sign flip issue in this function. The mathematical derivation states the search range should be:

```
c ∈ [max{minRefHeight - E + 1, epochFirstHeight}, maxRefHeight]
```

However, the current implementation uses `delta = E + 1`, resulting in `minRefHeight - delta = minRefHeight - E - 1`. This differs from the expected `minRefHeight - E + 1` by 2.

Example with E=600, epochFirstHeight=100, minRefHeight=1500:

- Expected start: max{1500 - 600 + 1, 100} = 901
- Current code: 1500 - 601 = 899

This could cause the search to start 2 blocks earlier than necessary (minor performance impact) or potentially miss edge cases.

🔎 Suggested fix based on past review analysis

```diff
 func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64) {
-	delta := uint64(flow.DefaultTransactionExpiry + 1)
-	if minRefHeight <= ctx.refEpochFirstHeight+delta {
+	// We need to search collections with reference height c ∈ (minRefHeight-E, maxRefHeight]
+	// Converting to inclusive range: c ∈ [minRefHeight-E+1, maxRefHeight]
+	// Bounded by epoch start: c ∈ [max{minRefHeight-E+1, epochFirstHeight}, maxRefHeight]
+	if minRefHeight < ctx.refEpochFirstHeight+flow.DefaultTransactionExpiry {
 		return ctx.refEpochFirstHeight, maxRefHeight // bound at start of epoch
 	}
-	return minRefHeight - delta, maxRefHeight
+	// Safe from underflow since minRefHeight >= epochFirstHeight + E >= E
+	return minRefHeight + 1 - flow.DefaultTransactionExpiry, maxRefHeight
 }
```
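The off-by-two claim can be checked numerically with the values from the comment (E = 600, epochFirstHeight = 100, minRefHeight = 1500); the two helpers below are reductions of the current code and the suggested fix, not flow-go itself:

```go
package main

import "fmt"

// e is a stand-in for flow.DefaultTransactionExpiry.
const e = 600

// currentStart reproduces the lower bound computed by the current code.
func currentStart(minRef, epochFirst uint64) uint64 {
	delta := uint64(e + 1)
	if minRef <= epochFirst+delta {
		return epochFirst
	}
	return minRef - delta
}

// suggestedStart reproduces the lower bound from the suggested fix.
func suggestedStart(minRef, epochFirst uint64) uint64 {
	if minRef < epochFirst+e {
		return epochFirst
	}
	return minRef + 1 - e
}

func main() {
	fmt.Println(currentStart(1500, 100))   // 899
	fmt.Println(suggestedStart(1500, 100)) // 901
}
```

The two results differ by exactly 2, matching the expected start of 901 versus the current code's 899.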
🧹 Nitpick comments (8)

network/channels/errors_test.go (1)

32-33: Consider implementing the previously suggested solution using `package channels_test`.

While the comment documents the circular dependency workaround, the previous review discussion identified a cleaner solution: changing the package declaration to `package channels_test` (line 1). This is a standard Go testing pattern that would eliminate the circular dependency and allow using `CanonicalClusterID` consistently with the rest of the codebase, as you acknowledged would work.

🔎 Proposed refactor to use the test package pattern

At the top of the file (line 1):

```diff
-package channels
+package channels_test
```

Then update line 16 to use the package qualifier:

```diff
- err := NewInvalidTopicErr(topic, wrapErr)
+ err := channels.NewInvalidTopicErr(topic, wrapErr)
```

Similarly for line 35:

```diff
- err := NewUnknownClusterIdErr(clusterId, activeClusterIds)
+ err := channels.NewUnknownClusterIdErr(clusterId, activeClusterIds)
```

And update the `Is*` function calls (lines 23, 27, 42, 46) with the `channels.` qualifier. Then replace lines 32-33:

```diff
- // NOTE: we do not use CanonicalClusterID here to avoid circular dependency
- clusterId := flow.ChainID("cluster-id")
+ clusterId := flow.CanonicalClusterID("cluster-id")
```

integration/tests/access/cohort4/execution_data_pruning_test.go (1)
166-166: LGTM! Chain ID correctly matches the network configuration.

The addition of `flow.Localnet` as the chain ID parameter is consistent with the network setup on line 129 where `testnet.PrepareFlowNetwork` is called with `flow.Localnet`.

Optional: Consider creating Observer Node headers for consistency

While the current implementation works correctly (using Access Node headers to look up blocks for both nodes), you could optionally create a separate headers instance for the Observer Node for consistency:

```go
onHeaders := store.NewHeaders(metrics, onDB, flow.Localnet)
```

This isn't necessary for the test's current logic since both nodes are on the same network and have identical headers, but it would make the Observer Node setup symmetric with the Access Node setup.

cmd/util/cmd/exec-data-json-export/transaction_exporter.go (1)
52-64: LGTM! Consider extracting the chainID retrieval pattern.

The chainID retrieval and header initialization logic is correct. However, this exact pattern is duplicated across multiple exporter files (transaction_exporter.go, delta_snapshot_exporter.go, result_exporter.go, event_exporter.go, and block_exporter.go).

Optional: Extract common pattern into a helper function

Consider creating a helper function in the `common` package to reduce duplication:

```go
// In cmd/util/cmd/common/storage.go
func InitHeadersWithChainID(db storage.DB, cacheMetrics module.CacheMetrics) (*store.Headers, error) {
	chainID, err := badgerstate.GetChainIDFromLatestFinalizedHeader(db)
	if err != nil {
		return nil, err
	}
	return store.NewHeaders(cacheMetrics, db, chainID), nil
}
```

Then use it in exporters:

```go
headers, err := common.InitHeadersWithChainID(db, cacheMetrics)
if err != nil {
	return err
}
```

storage/badger/all.go (1)
headers, err := common.InitHeadersWithChainID(db, cacheMetrics) if err != nil { return err }storage/badger/all.go (1)
13-14: Deprecation comment format should follow Go conventions.The deprecation comment should use the standard Go
Deprecated:format to be recognized by tools likego vetand IDEs.🔎 Suggested fix
-// deprecated: use [store.InitAll] instead +// Deprecated: use [store.InitAll] instead. func InitAll(metrics module.CacheMetrics, db *badger.DB, chainID flow.ChainID) *storage.All {cmd/util/cmd/read-badger/cmd/cluster_blocks.go (1)
37-43: Consider adding validation before calling `NewClusterHeaders`.

If a user provides a non-cluster chain ID (e.g., `flow-emulator`), `NewClusterHeaders` will panic. For a CLI tool, a user-friendly error message might be preferable.

🔎 Proposed validation

```diff
 // get chain id
 log.Info().Msgf("got flag chain name: %s", flagChainName)
 chainID := flow.ChainID(flagChainName)
+if !cluster.IsCanonicalClusterID(chainID) {
+	return fmt.Errorf("chain ID %q is not a valid cluster chain ID", chainID)
+}
 clusterHeaders := store.NewClusterHeaders(metrics, db, chainID)
```

This would require importing the `cluster` package.

state/protocol/badger/state.go (1)
storage.ErrNotFounddirectly and instead useIncompleteStateErrorto distinguish between benign not-found errors and actual state corruption. The current implementation:
- Line 1013: Returns bare error from
RetrieveFinalizedHeight, which could bestorage.ErrNotFound- Lines 1018-1020, 1026-1028: Wraps
ErrNotFoundwith contextual messages but doesn't wrap asIncompleteStateErrorThis creates inconsistency where the same sentinel error (
storage.ErrNotFound) is returned differently depending on which lookup fails. Consider applying the suggested pattern from past reviews to wrap all not-found errors consistently.🔎 Suggested consistent error wrapping
func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) { var finalized uint64 r := db.Reader() err := operation.RetrieveFinalizedHeight(r, &finalized) if err != nil { + if errors.Is(err, storage.ErrNotFound) { + return nil, fmt.Errorf("could not retrieve finalized height: node may not be bootstrapped: %w", err) + } return nil, err }state/cluster/badger/mutator_test.go (1)
447-452: Test re-enabled and validates cluster reference block rejection.This test was previously skipped and is now enabled to verify that using a cluster block as a reference block is rejected. For consistency with similar tests in this file (e.g.,
TestExtend_WithReferenceBlockFromDifferentEpochat lines 466-469), consider adding an assertion on the error type.🔎 Optional: Add error type assertion
func (suite *MutatorSuite) TestExtend_WithReferenceBlockFromClusterChain() { // set genesis from cluster chain as reference block proposal := suite.ProposalWithParentAndPayload(suite.genesis, *model.NewEmptyPayload(suite.genesis.ID())) err := suite.state.Extend(&proposal) suite.Assert().Error(err) + suite.Assert().True(state.IsInvalidExtensionError(err)) }storage/store/headers.go (1)
80-83: Document the unsupportedByViewbehavior for cluster headers.The
retrieveViewfunction returns an error for cluster headers, but this limitation should be documented in thestorage.Headersinterface'sByViewmethod. Users of cluster headers should know that view-based lookups are not available.Per past review comments from AlexHentschel, consider adding a dedicated sentinel error (e.g.,
ErrNotAvailableForClusterConsensus) and documenting it in the interface.🔎 Suggested sentinel error
// In storage/errors.go, add: var ErrNotAvailableForClusterConsensus = errors.New("operation not available for cluster consensus")Then in
retrieveView:retrieveView := func(r storage.Reader, height uint64) (flow.Identifier, error) { var id flow.Identifier - return id, fmt.Errorf("retrieve by view not implemented for cluster headers") + return id, fmt.Errorf("ByView not supported for cluster headers: %w", storage.ErrNotAvailableForClusterConsensus) }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (82)
admin/commands/storage/read_range_cluster_blocks.go(3 hunks)cmd/collection/main.go(1 hunks)cmd/scaffold.go(2 hunks)cmd/util/cmd/common/storage.go(2 hunks)cmd/util/cmd/exec-data-json-export/block_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/delta_snapshot_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/event_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/result_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/transaction_exporter.go(2 hunks)cmd/util/cmd/export-json-transactions/cmd.go(2 hunks)cmd/util/cmd/export-json-transactions/transactions/range_test.go(1 hunks)cmd/util/cmd/find-inconsistent-result/cmd.go(2 hunks)cmd/util/cmd/read-badger/cmd/blocks.go(2 hunks)cmd/util/cmd/read-badger/cmd/cluster_blocks.go(1 hunks)cmd/util/cmd/read-badger/cmd/collections.go(5 hunks)cmd/util/cmd/read-badger/cmd/epoch_commit.go(3 hunks)cmd/util/cmd/read-badger/cmd/epoch_protocol_state.go(3 hunks)cmd/util/cmd/read-badger/cmd/guarantees.go(3 hunks)cmd/util/cmd/read-badger/cmd/protocol_kvstore.go(3 hunks)cmd/util/cmd/read-badger/cmd/seals.go(4 hunks)cmd/util/cmd/read-badger/cmd/transaction_results.go(2 hunks)cmd/util/cmd/read-badger/cmd/transactions.go(3 hunks)cmd/util/cmd/read-light-block/read_light_block_test.go(1 hunks)cmd/util/cmd/read-protocol-state/cmd/blocks.go(2 hunks)cmd/util/cmd/read-protocol-state/cmd/snapshot.go(2 hunks)cmd/util/cmd/rollback-executed-height/cmd/rollback_executed_height.go(3 hunks)cmd/util/cmd/rollback-executed-height/cmd/rollback_executed_height_test.go(2 hunks)cmd/util/cmd/snapshot/cmd.go(2 hunks)cmd/util/cmd/verify-evm-offchain-replay/verify.go(2 hunks)consensus/integration/nodes_test.go(1 hunks)consensus/recovery/protocol/state_test.go(1 hunks)engine/access/access_test.go(4 hunks)engine/access/ingestion/collections/indexer_test.go(1 hunks)engine/access/rpc/backend/transactions/transactions_functional_test.go(1 hunks)engine/collection/compliance/engine_test.go(2 hunks)engine/collection/epochmgr/engine.go(1 
hunks)engine/collection/epochmgr/engine_test.go(3 hunks)engine/collection/epochmgr/factories/cluster_state.go(2 hunks)engine/collection/epochmgr/factories/epoch.go(3 hunks)engine/collection/epochmgr/factory.go(2 hunks)engine/collection/epochmgr/mock/epoch_components_factory.go(3 hunks)engine/collection/message_hub/message_hub_test.go(1 hunks)engine/collection/test/cluster_switchover_test.go(1 hunks)engine/common/follower/integration_test.go(1 hunks)engine/execution/pruner/core_test.go(1 hunks)engine/testutil/nodes.go(2 hunks)engine/verification/verifier/verifiers.go(1 hunks)integration/testnet/container.go(1 hunks)integration/tests/access/cohort4/access_test.go(1 hunks)integration/tests/access/cohort4/execution_data_pruning_test.go(1 hunks)model/flow/chain.go(1 hunks)module/block_iterator/iterator_test.go(2 hunks)module/builder/collection/builder.go(4 hunks)module/builder/collection/builder_test.go(29 hunks)module/executiondatasync/optimistic_sync/pipeline/pipeline_functional_test.go(1 hunks)module/finalizedreader/finalizedreader_test.go(2 hunks)module/finalizer/collection/finalizer_test.go(2 hunks)module/finalizer/consensus/finalizer_test.go(3 hunks)network/channels/errors_test.go(1 hunks)state/cluster/badger/mutator.go(9 hunks)state/cluster/badger/mutator_test.go(2 hunks)state/cluster/badger/snapshot_test.go(2 hunks)state/cluster/root_block.go(1 hunks)state/protocol/badger/mutator_test.go(4 hunks)state/protocol/badger/state.go(1 hunks)state/protocol/badger/state_test.go(3 hunks)state/protocol/util/testing.go(9 hunks)storage/badger/all.go(1 hunks)storage/errors.go(1 hunks)storage/headers.go(2 hunks)storage/operation/cluster.go(2 hunks)storage/operation/cluster_test.go(5 hunks)storage/operation/headers.go(1 hunks)storage/store/blocks.go(9 hunks)storage/store/blocks_test.go(7 hunks)storage/store/cluster_blocks_test.go(1 hunks)storage/store/guarantees_test.go(3 hunks)storage/store/headers.go(8 hunks)storage/store/headers_test.go(7 hunks)storage/store/init.go(2 
hunks)storage/store/payloads_test.go(2 hunks)utils/unittest/cluster_block.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (69)
module/finalizer/collection/finalizer_test.go (3)
utils/unittest/block.go (1)
BlockFixture(14-21)utils/unittest/locks.go (1)
WithLock(15-26)storage/locks.go (1)
LockInsertBlock(14-14)
engine/access/rpc/backend/transactions/transactions_functional_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/find-inconsistent-result/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/common/storage.go (4)
model/flow/chain.go (1)
ChainID(14-14)storage/store/init.go (2)
All(9-29)InitAll(34-76)module/metrics/noop.go (1)
NoopCollector(22-22)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/verify-evm-offchain-replay/verify.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/verification/verifier/verifiers.go (1)
cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/export-json-transactions/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/read-protocol-state/cmd/snapshot.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/collection/message_hub/message_hub_test.go (2)
state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
admin/commands/storage/read_range_cluster_blocks.go (4)
storage/store/cluster_payloads.go (1)
ClusterPayloads(16-19)storage/store/headers.go (1)
NewClusterHeaders(62-85)module/metrics/noop.go (1)
NoopCollector(22-22)model/flow/chain.go (1)
ChainID(14-14)
engine/collection/compliance/engine_test.go (2)
state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
engine/collection/epochmgr/engine.go (1)
model/flow/chain.go (1)
ChainID(14-14)
engine/execution/pruner/core_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-protocol-state/cmd/blocks.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
storage/operation/cluster.go (1)
storage/operation/headers.go (1)
InsertClusterHeader(57-73)
integration/tests/access/cohort4/execution_data_pruning_test.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
Localnet(35-35)
cmd/util/cmd/read-badger/cmd/transaction_results.go (3)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)storage/store/transaction_results.go (1)
NewTransactionResults(62-130)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
module/executiondatasync/optimistic_sync/pipeline/pipeline_functional_test.go (1)
model/flow/chain.go (1)
ChainID(14-14)
utils/unittest/cluster_block.go (3)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
storage/badger/all.go (3)
storage/store/init.go (2)
InitAll(34-76)All(9-29)model/flow/chain.go (1)
ChainID(14-14)storage/store/headers.go (1)
NewHeaders(33-57)
consensus/integration/nodes_test.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
ChainID(14-14)
storage/store/headers_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/store/headers.go (3)
NewHeaders(33-57)Headers(18-26)NewClusterHeaders(62-85)
cmd/util/cmd/read-badger/cmd/cluster_blocks.go (2)
storage/store/headers.go (1)
NewClusterHeaders(62-85)storage/store/cluster_payloads.go (1)
NewClusterPayloads(23-37)
storage/operation/cluster_test.go (3)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
engine/access/access_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
engine/access/ingestion/collections/indexer_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
state/protocol/util/testing.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
module/block_iterator/iterator_test.go (2)
model/flow/chain.go (1)
ChainID(14-14)storage/store/headers.go (1)
NewHeaders(33-57)
cmd/util/cmd/snapshot/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/common/follower/integration_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-badger/cmd/protocol_kvstore.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
engine/collection/epochmgr/factory.go (1)
model/flow/chain.go (1)
ChainID(14-14)
state/protocol/badger/state_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-badger/cmd/guarantees.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
storage/store/blocks_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/export-json-transactions/transactions/range_test.go (2)
cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)model/flow/chain.go (1)
ChainID(14-14)
state/protocol/badger/mutator_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/collection/main.go (1)
admin/commands/storage/read_range_cluster_blocks.go (1)
NewReadRangeClusterBlocksCommand(29-34)
cmd/util/cmd/read-badger/cmd/seals.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
cmd/util/cmd/exec-data-json-export/result_exporter.go (3)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)module/metrics/noop.go (1)
NoopCollector(22-22)storage/store/headers.go (1)
NewHeaders(33-57)
cmd/util/cmd/read-badger/cmd/blocks.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)storage/store/headers.go (1)
NewHeaders(33-57)
state/cluster/badger/snapshot_test.go (5)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)storage/store/cluster_payloads.go (1)
NewClusterPayloads(23-37)storage/store/headers.go (2)
NewClusterHeaders(62-85)Headers(18-26)state/cluster/badger/mutator.go (1)
NewMutableState(34-44)
integration/testnet/container.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
ChainID(14-14)
module/finalizedreader/finalizedreader_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
engine/testutil/nodes.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
storage/store/cluster_blocks_test.go (1)
model/flow/chain.go (1)
ChainID(14-14)
consensus/recovery/protocol/state_test.go (1)
storage/store/headers.go (1)
NewHeaders(33-57)
engine/collection/epochmgr/factories/cluster_state.go (4)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/badger/mutator.go (2)
MutableState(23-30)NewMutableState(34-44)storage/store/headers.go (3)
admin/commands/storage/read_range_cluster_blocks.go

```diff
+	clusterHeaders := store.NewClusterHeaders(&metrics.NoopCollector{}, c.db, flow.ChainID(chainID))
 	clusterBlocks := store.NewClusterBlocks(
-		c.db, flow.ChainID(chainID), c.headers, c.payloads,
+		c.db, flow.ChainID(chainID), clusterHeaders, c.payloads,
 	)
```
Handle potential panic from invalid chain ID.
store.NewClusterHeaders panics if the provided chainID is not a canonical cluster chain ID (per the relevant code snippet from storage/store/headers.go:61-84). Since chainID comes from user input at line 37, an invalid chain ID would cause a panic instead of returning a user-friendly error.
🔎 Suggested validation before creating headers

```diff
+import "github.com/onflow/flow-go/state/cluster"
+
 func (c *ReadRangeClusterBlocksCommand) Handler(ctx context.Context, req *admin.CommandRequest) (interface{}, error) {
 	chainID, err := parseString(req, "chain-id")
 	if err != nil {
 		return nil, err
 	}
+	if !cluster.IsCanonicalClusterID(flow.ChainID(chainID)) {
+		return nil, admin.NewInvalidAdminReqErrorf("invalid cluster chain ID: %s", chainID)
+	}
 	reqData, err := parseHeightRangeRequestData(req)
```

Committable suggestion skipped: line range outside the PR's diff.
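The pattern the review recommends — validate untrusted input up front rather than letting a panicking constructor see it — can be sketched as follows. This is a self-contained illustration, not the real flow-go API: the `ChainID` type, the `"cluster-"` prefix convention, and all function names here are stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ChainID stands in for flow.ChainID; the "cluster-" prefix is a
// hypothetical canonical form, not the real flow-go encoding.
type ChainID string

// isCanonicalClusterID mimics the shape of a validation predicate like
// cluster.IsCanonicalClusterID: a pure check callers can run first.
func isCanonicalClusterID(id ChainID) bool {
	return strings.HasPrefix(string(id), "cluster-")
}

// newClusterHeaders models a constructor that treats a bad chain ID as a
// programmer error and panics, assuming its input was already validated.
func newClusterHeaders(id ChainID) string {
	if !isCanonicalClusterID(id) {
		panic(fmt.Sprintf("not a canonical cluster chain ID: %s", id))
	}
	return "headers store for " + string(id)
}

// handler validates the user-supplied chain ID first, so invalid input
// becomes a user-facing error instead of a panic inside the constructor.
func handler(userInput string) (string, error) {
	id := ChainID(userInput)
	if !isCanonicalClusterID(id) {
		return "", errors.New("invalid cluster chain ID: " + userInput)
	}
	return newClusterHeaders(id), nil
}

func main() {
	if _, err := handler("mainnet"); err != nil {
		fmt.Println("rejected:", err)
	}
	store, _ := handler("cluster-42")
	fmt.Println(store)
}
```

The point is the ordering: the admin command is the trust boundary, so the validation belongs there, and the panicking constructor is only ever reached with pre-checked input.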
A ChainID must now be provided to a Headers storage instance on creation.
That storage instance can then only store or retrieve headers belonging to that ChainID, and the height-based index is likewise scoped to that ChainID.
This fixes a bug where cluster blocks from the previous epoch's cluster chain were unnecessarily checked for duplicate transactions.
Closes: #4204
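Conceptually, the change binds each Headers instance to a single chain at construction time, so both ID-based and height-based lookups can never cross into another chain's headers. A minimal sketch of that idea, with illustrative types and names rather than the actual flow-go storage API:

```go
package main

import "fmt"

// ChainID and Header are simplified stand-ins for the flow-go types.
type ChainID string

type Header struct {
	ChainID ChainID
	Height  uint64
	ID      string
}

// Headers is a chain-scoped header store: the ChainID is fixed when the
// store is created, and every write is checked against it.
type Headers struct {
	chainID  ChainID
	byID     map[string]Header
	byHeight map[uint64]string // height index is scoped to chainID as well
}

func NewHeaders(chainID ChainID) *Headers {
	return &Headers{
		chainID:  chainID,
		byID:     make(map[string]Header),
		byHeight: make(map[uint64]string),
	}
}

// Store rejects headers from any other chain instead of silently mixing
// them into this chain's indexes.
func (h *Headers) Store(hdr Header) error {
	if hdr.ChainID != h.chainID {
		return fmt.Errorf("wrong chain: store is scoped to %s, got %s", h.chainID, hdr.ChainID)
	}
	h.byID[hdr.ID] = hdr
	h.byHeight[hdr.Height] = hdr.ID
	return nil
}

// ByHeight resolves a height only within this store's chain.
func (h *Headers) ByHeight(height uint64) (Header, error) {
	id, ok := h.byHeight[height]
	if !ok {
		return Header{}, fmt.Errorf("no header at height %d on chain %s", height, h.chainID)
	}
	return h.byID[id], nil
}

func main() {
	headers := NewHeaders("cluster-epoch-2")
	fmt.Println(headers.Store(Header{ChainID: "cluster-epoch-2", Height: 10, ID: "abc"}))
	// A header from the previous epoch's cluster chain is rejected outright.
	fmt.Println(headers.Store(Header{ChainID: "cluster-epoch-1", Height: 10, ID: "old"}))
	hdr, _ := headers.ByHeight(10)
	fmt.Println(hdr.ID)
}
```

With this scoping, a lookup by height on the current cluster's store cannot return a header from the previous epoch's chain — which is the class of cross-chain confusion the de-duplication bug came from.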