Differentiate between Consensus and Cluster Headers storage #8222
base: master
Conversation
Weakens the chainID requirement for cluster chains when reading from storage.
module/builder/collection/builder.go
Outdated
```diff
 for _, blockID := range clusterBlockIDs {
-	header, err := b.clusterHeaders.ByBlockID(blockID)
+	header, err := b.clusterHeaders.ByBlockID(blockID) // TODO(4204) transaction deduplication crosses clusterHeaders epoch boundary
```
Transaction de-duplication actually does not occur across cluster and epoch boundaries.
- Each transaction is uniquely assigned to one cluster in one epoch, based on the transaction's reference block (see ingestion logic)
- Therefore, each cluster has a range of reference block heights it can accept. These ranges are equivalent to the height range of blocks within an epoch: $[FirstBlockInEpoch.Height, LastBlockInEpoch.Height]$. These ranges are consecutive and do not overlap.
- In short, if we are considering a cluster block with reference block height $FirstBlockInEpoch.Height$, then `minRefHeight` is actually $FirstBlockInEpoch.Height$ (we don't need to search further back).
- We already take this into account when determining the lowest possible reference block.
So I think we can remove this TODO, and remove the special-case logic in storage.Headers meant to work around this. I would also suggest adding some documentation here explaining why there is no overlap between clusters and epochs.
Thank you for clarifying this explicitly. Now that I understand better what's going on, I can say that while you are right that that's what should be happening, it is not, due to two bugs 🐞🐞. The existing behaviour did actually look across the epoch boundary (now causing cluster_switchover_test to fail), because:
- 🐞 `ctx.refEpochFirstHeight` is never initialized in the block builder, so it defaults to 0. (Introduced in "Enforce that collection reference blocks are bound to the cluster's operating epoch" #4148 and not caught in "Removes unused first height field from LN builder" #6828.) Essentially, the `minRefHeight` did not get clamped to the start of the epoch, and was always `LastFinalizedBlockInEpoch.Height - DefaultTransactionExpiry`.
- 🐞 In addition, the minimum height of the range is decreased twice: once in `lowestPossibleReferenceBlockHeight()` (which would respect the epoch boundary, if not for bug 1) and then again in `findRefHeightSearchRangeForConflictingClusterBlocks` (which does not respect the epoch boundary).

I believe this means that this particular lookup was always checking approximately the past ~1200 reference heights (`flow.DefaultTransactionExpiry * 2`), regardless of which epoch those heights were in.
While the first bug also affects payload construction, I don't think it has impacted correctness (essentially transactions from the previous epoch would not immediately be considered expired, but they are already split into pools by epoch anyways as you have noted, so transactions from a different epoch will not be encountered.)
You're right. Oops, that was introduced by me 🫣.
Here's what I think we should do:
- Bring back the logic that populates `refEpochFirstHeight`.
- Check for `ErrNotFound`: if we see that error, then set `refEpochFirstHeight` to zero. This means that we have just joined the network, and that our local state cutoff (our root block) is newer than `refEpochFirstHeight`.
done in e397124 - does the comment on expected behaviour from setting refEpochFirstHeight = 0 match with what you expect?
To double-check: the minimum reference height to check for a duplicated transaction is approximately finalizedHeight - 2*DefaultTransactionExpiry, because of the following situation:
```
Collection                    C1          C2
                    ┏━━━━━━━━━━━┫
Transaction         T1          T2
                    ┊           ┊
Reference Block     R1──────────R2──────────Head
                     <-Expiry->  <-Expiry->
```
The new collection C2 could barely include transaction T2; to ensure we deduplicate correctly, we need to check collection C1, which has reference block R1 (because it could just barely include transaction T1).
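Combining this with the epoch-boundary discussion above, the lower bound of the dedup search could be computed as in the following sketch. The function name and shape are illustrative (not flow-go's actual API); the expiry value matches `flow.DefaultTransactionExpiry`.

```go
package main

import "fmt"

// DefaultTransactionExpiry mirrors flow.DefaultTransactionExpiry (600 blocks).
const DefaultTransactionExpiry = 600

// lowestRefHeightToCheck sketches the dedup lower bound described above:
// roughly finalizedHeight - 2*Expiry, clamped both to zero (no underflow)
// and to the first height of the operating epoch, so the search never
// crosses the epoch boundary.
func lowestRefHeightToCheck(finalizedHeight, epochFirstHeight uint64) uint64 {
	var lowest uint64
	if finalizedHeight > 2*DefaultTransactionExpiry {
		lowest = finalizedHeight - 2*DefaultTransactionExpiry
	}
	if lowest < epochFirstHeight {
		lowest = epochFirstHeight // clamp to the epoch's first block
	}
	return lowest
}

func main() {
	fmt.Println(lowestRefHeightToCheck(10000, 9500)) // clamped to the epoch start
	fmt.Println(lowestRefHeightToCheck(10000, 8000)) // full 2*Expiry window applies
}
```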
@tim-barry I would go off this function and this documentation.
When we are inserting C2, the range we need to check is [C2MinRefHeight-Expiry, C2MaxRefHeight]
C2MinRefHeight is both the reference block height of C2, and the smallest reference block height of all transactions in C2. C2MaxRefHeight is the largest reference block height of all transactions in C2.
The check was unintentionally crossing an epoch boundary and retrieving headers from a previous cluster chain. In addition, `ctx.refEpochFirstHeight` was never initialized. See #8222 (comment) for details
storage/store/headers.go
Outdated
```go
if header.ChainID != chainID {
	return fmt.Errorf("expected chain ID %v, got %v: %w", chainID, header.ChainID, storage.ErrWrongChain)
}
if chainID.IsClusterChain() {
```
Why do we need this check? From what I see, operation.InsertHeader already ensures that we hold a lock.
Since we now know which kind of chain we are attempting to insert a Header for, we can be more granular about the specific lock required, and ensure we don't accidentally insert a main chain Header while only holding storage.LockInsertOrFinalizeClusterBlock, or vice versa.
Following up on this discussion:
we can be more granular about the specific lock required
I agree. Though, I think this finer granularity should live in the lowest level possible. Thereby, we guarantee that all code paths must go through the fine-grained check.
I would suggest to differentiate on the level of the storage methods: i.e. split InsertHeader into two more specialized methods.
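The suggested split could look roughly like the following sketch. Lock handling is modeled here with plain strings for illustration only; the real code uses `lockctx.Proof` and the `storage.Lock…` constants, and the function signatures differ.

```go
package main

import "fmt"

// Lock names modeled as plain strings for illustration; flow-go uses
// lockctx.Proof and dedicated lock constants instead.
const (
	LockInsertBlock                  = "lock_insert_block"
	LockInsertOrFinalizeClusterBlock = "lock_insert_or_finalize_cluster_block"
)

// insertMainHeader and insertClusterHeader sketch the suggested split of
// InsertHeader into two specialized methods: each one checks for exactly
// the lock that guards its chain, so a caller can never insert a
// main-chain header while only holding the cluster lock, or vice versa.
func insertMainHeader(heldLock string) error {
	if heldLock != LockInsertBlock {
		return fmt.Errorf("missing required lock %q (held %q)", LockInsertBlock, heldLock)
	}
	return nil // proceed with the write batch
}

func insertClusterHeader(heldLock string) error {
	if heldLock != LockInsertOrFinalizeClusterBlock {
		return fmt.Errorf("missing required lock %q (held %q)", LockInsertOrFinalizeClusterBlock, heldLock)
	}
	return nil // proceed with the write batch
}

func main() {
	// Holding the cluster lock while inserting a main-chain header fails.
	fmt.Println(insertMainHeader(LockInsertOrFinalizeClusterBlock) != nil)
}
```

Because every code path must go through one of the two specialized functions, the fine-grained check lives at the lowest level, as the comment above recommends.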
…D/storage init preInitFns can contain DynamicStartPreInit, which affects how the root snapshot is loaded. Since ChainID is read from the root snapshot to initialize storage if the node is not bootstrapped, it should come after preInit functions.
AlexHentschel
left a comment
First batch of comments. Still reviewing ...
model/flow/chain.go
Outdated
```go
// IsClusterChain returns whether the chain ID is for a collection cluster during an epoch, rather than a full network.
func (c ChainID) IsClusterChain() bool {
	return strings.HasPrefix(string(c), "cluster")
}
```
I would highly discourage using the string directly, hardcoded in the implementation. Up to this point, a cluster's chain ID could be entirely arbitrary. Hence, using just "some string" was fine. That it started with `cluster` was solely for human readability. This is changing now, so we want to enforce that the same uniform convention is used everywhere in the code base.
You are establishing an archetype of an implementation pattern here, that other people might replicate. Please design the code such that
- It is easy to verify that the code uses the same implementation everywhere. Your code should be very expressive that this is a binding convention, which must be followed everywhere.
- Updating the convention is easy and has low probability of introducing bugs (this should automatically follow from a good implementation satisfying 1. So in a way, you can use 2. as a self-check for your own implementation whether your code satisfies 1. well).
With the current pattern, I see the risk that the string `cluster` appears in multiple different locations in the code. Hence, when changing the convention, we risk that one of the implementations is forgotten, creating a bug.
Suggestions:
- Your implementation should convey (in code and documentation): this is a binding naming convention, that must be followed consistently!
- Introducing a constant (e.g. `ClusterIDPrefix`) helps; see the `Mainnet` constant for an example. Diligently documenting the naming convention as part of the constant's godoc would help a lot in my opinion. Then for every place using that constant, it is clear that they should be applying the same convention.
- We want to keep the code that checks whether something is a cluster chain and the code that generates the cluster chain ID close to each other. This helps engineers reviewing or working with the code tremendously to see the connection.
  - The functions `IsClusterChain` and `CanonicalClusterID` should live directly next to each other.
  - This is cluster specific, so my preference would be putting the constant `ClusterIDPrefix` and the function `IsClusterChain` into the package `state/cluster`, right next to `CanonicalClusterID` and `CanonicalRootBlock`, which must also follow this convention. Unfortunately, I suspect this would create a circular dependency. If that is the case, I would suggest putting the constant `ClusterIDPrefix`, the generating function `CanonicalClusterID`, and the checking function `IsClusterChain` all into the file `model/flow/cluster.go`.
- Please create a unit test that verifies the convention:
  - `CanonicalClusterID` generates chain IDs that start with the constant `ClusterIDPrefix`
  - `IsClusterChain` accepts those chain IDs generated by `CanonicalClusterID`
  - `IsClusterChain` rejects all identifiers that we use for the main chains of different networks (see `AllChainIDs`)
- I would suggest making the check inside `IsClusterChain` as tight as possible. The convention currently is (`state/cluster/root_block.go`, line 13 in 502f81c):

  ```go
  return flow.ChainID(fmt.Sprintf("cluster-%d-%s", epoch, participants.ID()))
  ```

  You could specifically check with a regular expression for precisely the expected structure (integer for the epoch followed by a hex string).
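Putting these suggestions together, a sketch of the generator, the constant, and a regex-tightened check might look like this. It is simplified relative to flow-go: the real generator takes a `flow.IdentityList` and returns `flow.ChainID`, and the exact hex-length of `participants.ID()` is not pinned down here.

```go
package main

import (
	"fmt"
	"regexp"
)

// ClusterIDPrefix documents the binding naming convention for cluster chain
// IDs: "cluster-<epoch>-<hex identifier>", as produced by CanonicalClusterID.
// Every place generating or checking cluster chain IDs must use this constant.
const ClusterIDPrefix = "cluster"

// clusterChainIDRegex matches precisely the canonical structure: the prefix,
// a decimal epoch counter, and a hex-encoded participants identifier.
var clusterChainIDRegex = regexp.MustCompile(`^` + ClusterIDPrefix + `-\d+-[0-9a-f]+$`)

// CanonicalClusterID sketches the existing generator from state/cluster/root_block.go,
// with the participants identifier simplified to a hex string parameter.
func CanonicalClusterID(epoch uint64, participantsID string) string {
	return fmt.Sprintf("%s-%d-%s", ClusterIDPrefix, epoch, participantsID)
}

// IsClusterChain reports whether chainID follows the canonical cluster convention.
func IsClusterChain(chainID string) bool {
	return clusterChainIDRegex.MatchString(chainID)
}

func main() {
	id := CanonicalClusterID(5, "deadbeef")
	fmt.Println(id, IsClusterChain(id), IsClusterChain("flow-mainnet"))
}
```

Because both functions share `ClusterIDPrefix` and sit side by side, updating the convention touches exactly one file, which is the maintainability property the review asks for.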
storage/store/headers.go
Outdated
```go
if chainID.IsClusterChain() {
	panic("NewHeaders called on cluster chain ID - use NewClusterHeaders instead")
}
```
Using panics in production code is almost universally discouraged. We tolerate (still discourage) it only for struct-internal sanity checks that the struct itself should guarantee never fail. It should be documented why this panic should never happen and details on how the struct guarantees this.
For this constructor, I would like to request that an error (you can decide whether you want to throw an `irrecoverable` exception, a generic error, or a typed error) is returned instead of a panic, because we are dealing with an external input.
The reason is that errors quite nicely preserve the call stack of where the error happened. In contrast, the panic just terminates the program without much information which call stack led to the panic ... which makes debugging very cumbersome.
Suggestion:
- I feel this PR has already quite a large change surface. You could include a TODO, indicating that this panic will be changed to an error return later.
- Please also emphasize this in your PR description, specifically referencing the affected relevant code locations. This manages the expectation of reviewers of this PR. AND it helps reviewers of your subsequent PR to verify that really all panics have been replaced by error returns.
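The requested error-returning constructor could be sketched as follows. This is a trimmed-down illustration: the real `Headers` struct, its dependencies (`module.CacheMetrics`, `storage.DB`), and the final `IsClusterChain` implementation are omitted, and the stand-in check is the simple prefix test from the diff above.

```go
package main

import (
	"fmt"
	"strings"
)

// isClusterChain is a stand-in for the real convention check discussed above.
func isClusterChain(chainID string) bool {
	return strings.HasPrefix(chainID, "cluster")
}

// Headers is a trimmed-down stand-in for the storage struct.
type Headers struct {
	chainID string
}

// NewHeaders sketches the requested pattern: return an error for invalid
// external input instead of panicking, so the caller's error wrapping
// preserves the call stack that led to the failure.
func NewHeaders(chainID string) (*Headers, error) {
	if isClusterChain(chainID) {
		return nil, fmt.Errorf("NewHeaders called with cluster chain ID %q - use NewClusterHeaders instead", chainID)
	}
	return &Headers{chainID: chainID}, nil
}

func main() {
	_, err := NewHeaders("cluster-1-deadbeef")
	fmt.Println(err != nil) // the constructor rejects cluster chain IDs
}
```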
storage/store/headers.go
Outdated
```go
if !chainID.IsClusterChain() {
	panic("NewClusterHeaders called on non-cluster chain ID - use NewHeaders instead")
}
```
Please see my prior comment and suggestion to replace the panic by an error return
```go
	err := operation.RetrieveHeader(r, blockID, &header)
	return &header, err

// It supports storing, caching and retrieving by block ID, and additionally indexes by header height and view.
func NewHeaders(collector module.CacheMetrics, db storage.DB, chainID flow.ChainID) *Headers {
```
Please document the requirements on the chainID input - I would reference the constant ClusterIDPrefix for further reading.
```go
// It supports storing, caching and retrieving by block ID, and additionally an index by header height.
func NewClusterHeaders(collector module.CacheMetrics, db storage.DB, chainID flow.ChainID) *Headers {
```
Please document the requirements on the chainID input - I would reference the constant ClusterIDPrefix for further reading.
```go
// ByBlockID returns the header with the given ID. It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header with the given ID exists
//   - [storage.ErrWrongChain] if the block header exists in the database but is part of a different chain than expected
```
This needs to be documented as part of the Headers interface. Please make sure you always keep the interface and implementation documentation consistent.
I very much like the error return for the method:
flow-go/storage/store/headers.go, lines 182 to 188 in e397124:

```go
// ByBlockID returns the header with the given ID. It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header with the given ID exists
//   - [storage.ErrWrongChain] if the block header exists in the database but is part of a different chain than expected
func (h *Headers) ByBlockID(blockID flow.Identifier) (*flow.Header, error) {
	return h.retrieveTx(blockID)
}
```
However, there are other methods that are conceptually very similar, whose implementations are still completely oblivious about the separation of cluster and main blocks:
Lines 29 to 31 in c1c435a:

```go
// Exists returns true if a header with the given ID has been stored.
// No errors are expected during normal operation.
Exists(blockID flow.Identifier) (bool, error)
```

Lines 38 to 45 in c1c435a:

```go
// ByParentID finds all children for the given parent block. The returned headers
// might be unfinalized; if there is more than one, at least one of them has to
// be unfinalized.
// CAUTION: this method is not backed by a cache and therefore comparatively slow!
//
// Expected error returns during normal operations:
//   - [storage.ErrNotFound] if no block with the given parentID is known
ByParentID(parentID flow.Identifier) ([]*flow.Header, error)
```

Lines 47 to 51 in c1c435a:

```go
// ProposalByBlockID returns the header with the given ID, along with the corresponding proposer signature.
// It is available for finalized blocks and those pending finalization.
// Error returns:
//   - [storage.ErrNotFound] if no block header or proposer signature with the given blockID exists
ProposalByBlockID(blockID flow.Identifier) (*flow.ProposalHeader, error)
```
In all cases, asking the cluster-bound Headers for a consensus block should return ErrWrongChain and vice versa.
Please keep the documentation of the interface and implementation consistent and include tests confirming that the documented error type is returned by the implementation as expected.
Related request
Currently, the interface documentation is too concise in my opinion, as it does not provide a broader picture
Lines 7 to 8 in c1c435a:

```go
// Headers represents persistent storage for blocks.
type Headers interface {
```
Please extend the interface documentation:
- Point out that, in general, multiple instances might exist in parallel inside a single node. Explain it on the example of the collectors storing headers of their own cluster consensus as well as the main consensus. Emphasize that care should be taken to interact with the correct instance, because otherwise the implementation will return [storage.ErrWrongChain].
- Emphasize that implementations for the cluster consensus do not yet support lookups by view:
  - add a dedicated sentinel (e.g. `ErrNotAvailableForClusterConsensus`)
  - document that this sentinel might be returned by the `ByView` method
  - check the error type in a test
Please add test cases confirming that the expected sentinel error types are returned for all relevant methods. Rule of thumb:
- Adding a new sentinel error as a possible return of some method? Add a test case confirming that exactly that type is produced by the implementation.
- Adding a new exception / generic error as a possible return of some method? Add a test case confirming that the function errors for the expected conditions, and that the implementation does not misrepresent the exception as a documented sentinel indicating a benign error case.
This distinction allows more granularity with which locks are required. Also similarly split up definition of the storeWithLock functor used by Header storage.
Generation and checking for cluster ChainIDs are now next to each other, currently in the `cluster/state` package (instead of the `model/flow` package where the ChainID type and standard chainIDs are defined). Switched to testing the full chainID with a regex instead of just the prefix.
Next batch of comments. Still reviewing.
Concerns about function GetChainIDFromLatestFinalizedHeader and GetLatestFinalizedHeader returning storage.ErrNotFound errors
Please see this and that comment for details. Instead of addressing the challenge on the level of GetChainIDFromLatestFinalizedHeader and GetLatestFinalizedHeader, I would recommend going to the lowest sensible level, i.e. functions RetrieveFinalizedHeight and RetrieveSealedHeight.
Working on this code is also a great opportunity to add the missing documentation to the functions RetrieveFinalizedHeight and RetrieveSealedHeight 😉
Specifically, I would recommend to add the following dedicated error to storage/operation/heights.go
```go
var (
	// IncompleteStateError indicates that some information cannot be retrieved from the database,
	// which the protocol mandates to be present. This can be a symptom of a corrupted state
	// or an incorrectly / incompletely bootstrapped node. In most cases, this is an exception.
	//
	// ATTENTION: in most cases, [IncompleteStateError] is a symptom of a corrupted state
	// or an incorrectly / incompletely bootstrapped node. Typically, this is an unexpected exception
	// and should not be checked for the same way as benign sentinel errors.
	IncompleteStateError = errors.New("data required by protocol is missing in database")
)
```

Then, utilize the knowledge that the requested values should always be present for a properly bootstrapped node in the implementation of `RetrieveFinalizedHeight` and `RetrieveSealedHeight`:
```go
// RetrieveFinalizedHeight reads the height of the latest finalized block directly from the database.
//
// During bootstrapping, the latest finalized block and its height are indexed and thereafter the
// latest finalized height is only updated (but never removed). Hence, for a properly bootstrapped
// node, this function should _always_ return a proper value.
//
// CAUTION: This function should only be called on properly bootstrapped nodes. If the state is
// corrupted or the node is not properly bootstrapped, this function may return [IncompleteStateError].
// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
//
// No error returns are expected during normal operations.
func RetrieveFinalizedHeight(r storage.Reader, height *uint64) error {
	var h uint64
	err := RetrieveByKey(r, MakePrefix(codeFinalizedHeight), &h)
	if err != nil {
		// mask the lower-level error to prevent confusion with the often benign `storage.ErrNotFound`:
		return fmt.Errorf("latest finalized height could not be read, which should never happen for bootstrapped nodes: %w", IncompleteStateError)
	}
	*height = h
	return nil
}

// RetrieveSealedHeight reads the height of the latest sealed block directly from the database.
//
// During bootstrapping, the latest sealed block and its height are indexed and thereafter the
// latest sealed height is only updated (but never removed). Hence, for a properly bootstrapped
// node, this function should _always_ return a proper value.
//
// CAUTION: This function should only be called on properly bootstrapped nodes. If the state is
// corrupted or the node is not properly bootstrapped, this function may return [IncompleteStateError].
// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
func RetrieveSealedHeight(r storage.Reader, height *uint64) error {
	var h uint64
	err := RetrieveByKey(r, MakePrefix(codeSealedHeight), &h)
	if err != nil {
		// mask the lower-level error to prevent confusion with the often benign `storage.ErrNotFound`:
		return fmt.Errorf("latest sealed height could not be read, which should never happen for bootstrapped nodes: %w", IncompleteStateError)
	}
	*height = h
	return nil
}
```

Please add tests confirming the correct error returns: IncompleteStateError, not ErrNotFound 🙏
```go
// GetChainIDFromLatestFinalizedHeader attempts to retrieve the consensus chainID
// from the latest finalized header in the database, before storage or protocol state have been initialized.
// Expected errors during normal operations:
//   - [storage.ErrNotFound] if the node is not bootstrapped.
func GetChainIDFromLatestFinalizedHeader(db storage.DB) (flow.ChainID, error) {
	h, err := GetLatestFinalizedHeader(db)
	if err != nil {
		return "", err
```
Concerns about the returned error type
I have mixed feelings about this. Normally, the approach you took here is exactly the pattern I would encourage engineers to follow. Typically, from the perspective of the low-level logic, requested data being absent is not necessarily a conclusive sign of state corruption. So we typically just escalate the error, document it properly and let the caller decide.
But but but 😅 ... why would it be legitimate to read data from an uninitialized node? This should never happen, unless the higher-level logic has a bug. In addition, we offer the method IsBootstrapped, by which the caller can check whether the node is bootstrapped. To be clear: I am essentially arguing that we should preempt usage patterns (something I typically discourage). The reason is that storage.ErrNotFound is often a benign error, but in this case it is most likely not. If it is mistaken for being benign, by just throwing it up the call stack and letting the top-level method decide (a method which might also have code paths that are expected to throw a benign ErrNotFound), then we are in trouble.
Therefore, I would suggest we throw anything but a ErrNotFound error here and explicitly document why.
Function naming
I would suggest renaming this to `GetChainID`. How precisely we retrieve it is largely an implementation detail, which is already covered in the documentation. This detail is not important enough in my opinion to emphasize as part of the function name.
potential documentation ambiguity
before storage or protocol state have been initialized.
I feel this wording is a bit ambiguous. The storage and protocol state have to be initialized and persisted to the database. This is essentially the bootstrapping. I think what you are trying to say is that this function reads directly from the database, without instantiating high-level storage abstractions or protocol state structs (?)
Suggestion
```diff
-// GetChainIDFromLatestFinalizedHeader attempts to retrieve the consensus chainID
-// from the latest finalized header in the database, before storage or protocol state have been initialized.
-// Expected errors during normal operations:
-//   - [storage.ErrNotFound] if the node is not bootstrapped.
-func GetChainIDFromLatestFinalizedHeader(db storage.DB) (flow.ChainID, error) {
-	h, err := GetLatestFinalizedHeader(db)
-	if err != nil {
-		return "", err
+// GetChainID retrieves the consensus chainID from the latest finalized block in the database. This
+// function reads directly from the database, without instantiating high-level storage abstractions
+// or the protocol state struct.
+//
+// During bootstrapping, the latest finalized block and its height are indexed and thereafter the
+// latest finalized height is only updated (but never removed). Hence, for a properly bootstrapped node,
+// this function should _always_ return a proper value (constant throughout the lifetime of the node).
+//
+// Note: This function should only be called on properly bootstrapped nodes. If the state is corrupted
+// or the node is not properly bootstrapped, this function may return [IncompleteStateError].
+// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
+// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
+//
+// No error returns are expected during normal operations.
+func GetChainID(db storage.DB) (flow.ChainID, error) {
+	h, err := GetLatestFinalizedHeader(db) // returns [operation.IncompleteStateError] if the required data is not found, but never storage.ErrNotFound.
+	if err != nil {
+		return "", fmt.Errorf("failed to read latest finalized block, which should never happen for bootstrapped nodes; call IsBootstrapped upfront if in doubt: %w", err)
```
Yes, you are correct about the intention (reading directly from the database before instantiating the higher-level interfaces).
Good to know that you consider this situation an exception to the default in terms of errors; I agree that we should be using IsBootstrapped anyways.
```go
// GetLatestFinalizedHeader attempts to retrieve the latest finalized header
// without going through the storage.Headers interface.
// Expected errors during normal operations:
//   - [storage.ErrNotFound] if the node is not bootstrapped.
func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
	var finalized uint64
	r := db.Reader()
	err := operation.RetrieveFinalizedHeight(r, &finalized)
	if err != nil {
		return nil, err
```
Same here: I would suggest returning anything but a `storage.ErrNotFound` error. As this deviates from a common convention, we should call it out and explain the reasoning:
```diff
-// GetLatestFinalizedHeader attempts to retrieve the latest finalized header
-// without going through the storage.Headers interface.
-// Expected errors during normal operations:
-//   - [storage.ErrNotFound] if the node is not bootstrapped.
-func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
-	var finalized uint64
-	r := db.Reader()
-	err := operation.RetrieveFinalizedHeight(r, &finalized)
-	if err != nil {
-		return nil, err
+// GetLatestFinalizedHeader retrieves the header of the latest finalized block. This function reads directly
+// from the database, without instantiating high-level storage abstractions or the protocol state struct.
+//
+// During bootstrapping, the latest finalized block and its height are indexed and thereafter the latest
+// finalized height is only updated (but never removed). Hence, for a properly bootstrapped node, this
+// function should _always_ return a proper value.
+//
+// Note: This function should only be called on properly bootstrapped nodes. If the state is corrupted
+// or the node is not properly bootstrapped, this function may return [IncompleteStateError].
+// The reason for not returning [storage.ErrNotFound] directly is to avoid confusion between the often
+// benign [storage.ErrNotFound] and failed reads of quantities that the protocol mandates to be present.
+//
+// No error returns are expected during normal operations.
+func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) {
+	var finalized uint64
+	r := db.Reader()
+	err := operation.RetrieveFinalizedHeight(r, &finalized) // returns [operation.IncompleteStateError] if the required data is not found, but never storage.ErrNotFound.
+	if err != nil {
+		return nil, fmt.Errorf("failed to read latest finalized height, which should never happen for bootstrapped nodes; call IsBootstrapped upfront if in doubt: %w", err)
```
```go
var _ storage.Headers = (*Headers)(nil)

// NewHeaders creates a Headers instance, which stores block headers.
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // NewHeaders creates a Headers instance, which stores block headers. | |
| // NewHeaders creates a Headers instance, which manages block headers of the main consensus (not cluster consensus). |
```go
// This error allows the caller to detect duplicate inserts. If the header is stored along with other parts
// of the block in the same batch, similar duplication checks can be skipped for storing other parts of the block.
// No other error returns are expected during normal operation.
func InsertClusterHeader(lctx lockctx.Proof, rw storage.ReaderBatchWriter, headerID flow.Identifier, header *flow.Header) error {
```

Please add a test analogous to `TestHeaderInsertCheckRetrieve` for the cluster operations.

In addition, tests verifying that `InsertHeader` and `InsertClusterHeader` fail if the wrong lock (or no lock) is acquired would be great; those tests are missing entirely at the moment for both methods.
```go
	ctx.refChainFinalizedHeight = mainChainFinalizedHeader.Height
	ctx.refChainFinalizedID = mainChainFinalizedHeader.ID()

	// If we don't have the epoch boundaries (first/final height ON MAIN CHAIN) cached, try retrieve and cache them
```

I am not sure if the comment is misleading or I am misunderstanding it 😅. I think this block of code does not do anything with the "final height", right?

```suggestion
// We can't specify the height of the epoch's first consensus block (height ON MAIN CHAIN) during which this cluster is
// active, because the builder is typically _instantiated_ before the epoch starts. However, the builder should only be
// called once the epoch has started, i.e. consensus has finalized the first block in the epoch. Consequently, we
// retrieve the epoch's first height on the first call of the builder, and cache it for future calls.
```
```go
	// can be missing if we joined (dynamic bootstrapped) in the middle of an epoch.
	// 0 means FinalizedAncestryLookup will not be bounded by the epoch start,
	// but only by which cluster blocks we have available.
	refEpochFirstHeight = 0
```

⚠️ I am not sure this is correct.

Assume the following situation:

- The collector joined late (dynamic bootstrapped).
- Some access node sends it a transaction that was already included before the collector joined (the AN might be byzantine, or the AN might be behind).

The collector can't just scan its locally known history and include the transaction if it doesn't appear in any blocks it knows. It has to guarantee that the transaction does not appear in the fork, no matter how little of the fork's history the collector node knows. Otherwise, it cannot propose (or risks being slashed).

Hence, I would conclude that just scanning the "cluster blocks we have available" is insufficient. Let's make sure we think this through properly and document the reasoning why the algorithm also works for short histories.

I'll try to look at this again tomorrow with a fresh brain. It is very well possible that the argument can be deduced from the code (we should document it nonetheless). If we can't work this out, let's ask Jordan for advice.

I think you are right that there is a potential issue here.
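The deduplication window under discussion rests on transaction expiry: a transaction is only includable while the chain has not advanced too far past its reference block. A simplified sketch of that rule (the 600-block constant mirrors `flow.DefaultTransactionExpiry`; the exact validity semantics live in flow-go, so treat this as an approximation):

```go
package main

import "fmt"

// transactionExpiry mirrors flow.DefaultTransactionExpiry (600 blocks).
const transactionExpiry = 600

// isExpired reports whether a transaction referencing block height refHeight
// can no longer be included once the chain has finalized finalHeight.
func isExpired(refHeight, finalHeight uint64) bool {
	return finalHeight > refHeight+transactionExpiry
}

func main() {
	fmt.Println(isExpired(1000, 1600)) // false: exactly at the boundary
	fmt.Println(isExpired(1000, 1601)) // true: past the expiry window
}
```

The open question above is whether this bounded window alone justifies scanning only locally available cluster blocks when the local history is shorter than the window.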
Co-authored-by: Alexander Hentschel <[email protected]>
Force-pushed from b5ccec2 to d2f2846
```go
	start = minRefHeight - flow.DefaultTransactionExpiry + 1
	if start > minRefHeight {
		start = 0 // overflow check
	}
```

On `func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64)`:

Overall, I think it would help to document the computation we perform here in a lot of detail, and tie it to the logic calling this function. Suggestion:

```suggestion
// findRefHeightSearchRangeForConflictingClusterBlocks computes the range of reference block heights of ancestor blocks
// which could possibly contain transactions duplicating those in our collection under construction, based on the range
// of reference heights of transactions in the collection under construction.
// The input range is the (inclusive) range of reference heights of transactions eligible for inclusion in the collection
// under construction. The output range is the (inclusive) range of reference heights which needs to be searched in order
// to avoid transaction repeats.
//
// Within a single epoch, we have argued that for a set of transactions, with `minRefHeight` (`maxRefHeight`) being
// the smallest (largest) reference block height, we only need to inspect collections with reference block heights
// c ∈ (minRefHeight-E, maxRefHeight]. Note that the lower bound is exclusive, while the upper bound is inclusive,
// which we transform to an inclusive range:
//
//	 c ∈ (minRefHeight-E, maxRefHeight]
//	⇔ c ∈ [minRefHeight-E+1, maxRefHeight]
//
// In order to take epoch boundaries into account, we note: A collector cluster is only responsible for transactions whose
// reference blocks are within the cluster's operating epoch. Thus, we can bound the lower end of the search range by the
// height of the first block in the epoch. Formally, we only need to inspect collections with reference block height
//
//	c ∈ [max{minRefHeight-E+1, epochFirstHeight}, maxRefHeight]
func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64) {
	// in order to avoid underflow, we rewrite the lower-bound equation entirely without subtraction:
	//	 max{minRefHeight-E+1, epochFirstHeight} == epochFirstHeight
	//	⇔ minRefHeight - E + 1 ≤ epochFirstHeight
	//	⇔ minRefHeight - E < epochFirstHeight
	//	⇔ minRefHeight < epochFirstHeight + E
	if minRefHeight < ctx.refEpochFirstHeight+flow.DefaultTransactionExpiry {
		return ctx.refEpochFirstHeight, maxRefHeight
	}
	// We reach the following line only if minRefHeight-E+1 > epochFirstHeight ≥ 0. Hence, an underflow is impossible.
	return minRefHeight + 1 - flow.DefaultTransactionExpiry, maxRefHeight
}
```
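The suggested computation can be exercised standalone. The sketch below mirrors the derivation with `blockBuildContext` reduced to the single field the formula needs, and a stand-in constant for `flow.DefaultTransactionExpiry`:

```go
package main

import "fmt"

// expiry is a stand-in for flow.DefaultTransactionExpiry (E = 600).
const expiry = 600

// searchRange returns the inclusive range [start, end] of reference heights to
// inspect: [max{minRefHeight-expiry+1, epochFirstHeight}, maxRefHeight].
func searchRange(minRefHeight, maxRefHeight, epochFirstHeight uint64) (start, end uint64) {
	// rewritten without subtraction to avoid uint64 underflow
	if minRefHeight < epochFirstHeight+expiry {
		return epochFirstHeight, maxRefHeight
	}
	return minRefHeight + 1 - expiry, maxRefHeight
}

func main() {
	s, e := searchRange(1500, 1510, 100) // deep inside the epoch
	fmt.Println(s, e)                    // 901 1510
	s, e = searchRange(300, 310, 100) // near the epoch start: clamped to 100
	fmt.Println(s, e)                 // 100 310
}
```

With `minRefHeight = 1500` the lower bound is `1500 - 600 + 1 = 901`; with `minRefHeight = 300` the formula would dip below the epoch start, so the range is clamped at 100.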
Walkthrough

This PR implements chain ID awareness throughout the Flow storage layer, enabling the differentiation between consensus and cluster headers. The changes bind a chain ID to each Headers instance at construction, introduce cluster-specific header operations, add chain validation during header retrieval, and thread the chain ID parameter through storage initialization across the codebase.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Init as Initialization
    participant ChainID as Chain ID Resolver
    participant Storage as Storage Layer
    participant Headers as Headers Store
    Init->>ChainID: determineChainID() or<br/>GetChainIDFromLatestFinalizedHeader()
    ChainID->>Storage: Query latest finalized header
    Storage-->>ChainID: Return header with ChainID
    ChainID-->>Init: Resolved chainID
    Init->>Storage: InitAll(metrics, db, chainID)
    Storage->>Headers: NewHeaders(collector, db, chainID) or<br/>NewClusterHeaders(collector, db, chainID)
    Headers->>Headers: Store chainID internally
    Headers-->>Storage: Headers instance bound to chainID
    Storage-->>Init: All storage initialized
    Note over Headers: Later retrieval operations<br/>validate header.ChainID<br/>== configured chainID
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring special attention:
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1

♻️ Duplicate comments (2)

storage/operation/headers.go (1)

57-73: LGTM: Cluster header insertion mirrors the consensus header pattern.

`InsertClusterHeader` correctly uses the cluster-specific lock (`LockInsertOrFinalizeClusterBlock`) while maintaining the same existence-check-before-upsert pattern as `InsertHeader`. Both functions share the same key namespace (`codeHeader`), which is intentional since headers are globally unique by ID regardless of whether they come from consensus or cluster chains.

Reminder: A past review comment requested adding tests analogous to `TestHeaderInsertCheckRetrieve` for cluster operations, as well as tests verifying lock acquisition. Please ensure these are addressed.

module/builder/collection/builder.go (1)
645-651: Potential off-by-two error in search range calculation.

A past review identified a sign flip issue in this function. The mathematical derivation states the search range should be:

```
c ∈ [max{minRefHeight - E + 1, epochFirstHeight}, maxRefHeight]
```

However, the current implementation uses `delta = E + 1`, resulting in `minRefHeight - delta = minRefHeight - E - 1`. This differs from the expected `minRefHeight - E + 1` by 2.

Example with E=600, epochFirstHeight=100, minRefHeight=1500:

- Expected start: max{1500 - 600 + 1, 100} = 901
- Current code: 1500 - 601 = 899

This could cause the search to start 2 blocks earlier than necessary (minor performance impact) or potentially miss edge cases.

🔎 Suggested fix based on past review analysis

```diff
 func findRefHeightSearchRangeForConflictingClusterBlocks(minRefHeight, maxRefHeight uint64, ctx *blockBuildContext) (start, end uint64) {
-	delta := uint64(flow.DefaultTransactionExpiry + 1)
-	if minRefHeight <= ctx.refEpochFirstHeight+delta {
+	// We need to search collections with reference height c ∈ (minRefHeight-E, maxRefHeight]
+	// Converting to inclusive range: c ∈ [minRefHeight-E+1, maxRefHeight]
+	// Bounded by epoch start: c ∈ [max{minRefHeight-E+1, epochFirstHeight}, maxRefHeight]
+	if minRefHeight < ctx.refEpochFirstHeight+flow.DefaultTransactionExpiry {
 		return ctx.refEpochFirstHeight, maxRefHeight // bound at start of epoch
 	}
-	return minRefHeight - delta, maxRefHeight
+	// Safe from underflow since minRefHeight >= epochFirstHeight + E >= E
+	return minRefHeight + 1 - flow.DefaultTransactionExpiry, maxRefHeight
 }
```
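The off-by-two claim can be checked numerically with the values from the comment (E = 600, epochFirstHeight = 100, minRefHeight = 1500); the two helpers below are reductions of the current code and the suggested fix, not flow-go itself:

```go
package main

import "fmt"

// e is a stand-in for flow.DefaultTransactionExpiry.
const e = 600

// currentStart reproduces the lower bound computed by the current code.
func currentStart(minRef, epochFirst uint64) uint64 {
	delta := uint64(e + 1)
	if minRef <= epochFirst+delta {
		return epochFirst
	}
	return minRef - delta
}

// suggestedStart reproduces the lower bound from the suggested fix.
func suggestedStart(minRef, epochFirst uint64) uint64 {
	if minRef < epochFirst+e {
		return epochFirst
	}
	return minRef + 1 - e
}

func main() {
	fmt.Println(currentStart(1500, 100))   // 899
	fmt.Println(suggestedStart(1500, 100)) // 901
}
```

The two results differ by exactly 2, matching the expected start of 901 versus the current code's 899.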
🧹 Nitpick comments (8)

network/channels/errors_test.go (1)

32-33: Consider implementing the previously suggested solution using `package channels_test`.

While the comment documents the circular dependency workaround, the previous review discussion identified a cleaner solution: changing the package declaration to `package channels_test` (line 1). This is a standard Go testing pattern that would eliminate the circular dependency and allow using `CanonicalClusterID` consistently with the rest of the codebase, as you acknowledged would work.

🔎 Proposed refactor to use the test package pattern

At the top of the file (line 1):

```diff
-package channels
+package channels_test
```

Then update line 16 to use the package qualifier:

```diff
- err := NewInvalidTopicErr(topic, wrapErr)
+ err := channels.NewInvalidTopicErr(topic, wrapErr)
```

Similarly for line 35:

```diff
- err := NewUnknownClusterIdErr(clusterId, activeClusterIds)
+ err := channels.NewUnknownClusterIdErr(clusterId, activeClusterIds)
```

And update the `Is*` function calls (lines 23, 27, 42, 46) with the `channels.` qualifier. Then replace lines 32-33:

```diff
- // NOTE: we do not use CanonicalClusterID here to avoid circular dependency
- clusterId := flow.ChainID("cluster-id")
+ clusterId := flow.CanonicalClusterID("cluster-id")
```

integration/tests/access/cohort4/execution_data_pruning_test.go (1)
166-166: LGTM! Chain ID correctly matches the network configuration.

The addition of `flow.Localnet` as the chain ID parameter is consistent with the network setup on line 129 where `testnet.PrepareFlowNetwork` is called with `flow.Localnet`.

Optional: Consider creating Observer Node headers for consistency

While the current implementation works correctly (using Access Node headers to look up blocks for both nodes), you could optionally create a separate headers instance for the Observer Node for consistency:

```go
onHeaders := store.NewHeaders(metrics, onDB, flow.Localnet)
```

This isn't necessary for the test's current logic since both nodes are on the same network and have identical headers, but it would make the Observer Node setup symmetric with the Access Node setup.

cmd/util/cmd/exec-data-json-export/transaction_exporter.go (1)
52-64: LGTM! Consider extracting the chainID retrieval pattern.

The chainID retrieval and header initialization logic is correct. However, this exact pattern is duplicated across multiple exporter files (transaction_exporter.go, delta_snapshot_exporter.go, result_exporter.go, event_exporter.go, and block_exporter.go).

Optional: Extract common pattern into a helper function

Consider creating a helper function in the `common` package to reduce duplication:

```go
// In cmd/util/cmd/common/storage.go
func InitHeadersWithChainID(db storage.DB, cacheMetrics module.CacheMetrics) (*store.Headers, error) {
	chainID, err := badgerstate.GetChainIDFromLatestFinalizedHeader(db)
	if err != nil {
		return nil, err
	}
	return store.NewHeaders(cacheMetrics, db, chainID), nil
}
```

Then use it in exporters:

```go
headers, err := common.InitHeadersWithChainID(db, cacheMetrics)
if err != nil {
	return err
}
```

storage/badger/all.go (1)
headers, err := common.InitHeadersWithChainID(db, cacheMetrics) if err != nil { return err }storage/badger/all.go (1)
13-14: Deprecation comment format should follow Go conventions.The deprecation comment should use the standard Go
Deprecated:format to be recognized by tools likego vetand IDEs.🔎 Suggested fix
-// deprecated: use [store.InitAll] instead +// Deprecated: use [store.InitAll] instead. func InitAll(metrics module.CacheMetrics, db *badger.DB, chainID flow.ChainID) *storage.All {cmd/util/cmd/read-badger/cmd/cluster_blocks.go (1)
37-43: Consider adding validation before calling `NewClusterHeaders`.

If a user provides a non-cluster chain ID (e.g., `flow-emulator`), `NewClusterHeaders` will panic. For a CLI tool, a user-friendly error message might be preferable.

🔎 Proposed validation

```diff
 // get chain id
 log.Info().Msgf("got flag chain name: %s", flagChainName)
 chainID := flow.ChainID(flagChainName)
+if !cluster.IsCanonicalClusterID(chainID) {
+	return fmt.Errorf("chain ID %q is not a valid cluster chain ID", chainID)
+}
 clusterHeaders := store.NewClusterHeaders(metrics, db, chainID)
```

This would require importing the `cluster` package.

state/protocol/badger/state.go (1)
storage.ErrNotFounddirectly and instead useIncompleteStateErrorto distinguish between benign not-found errors and actual state corruption. The current implementation:
- Line 1013: Returns bare error from
RetrieveFinalizedHeight, which could bestorage.ErrNotFound- Lines 1018-1020, 1026-1028: Wraps
ErrNotFoundwith contextual messages but doesn't wrap asIncompleteStateErrorThis creates inconsistency where the same sentinel error (
storage.ErrNotFound) is returned differently depending on which lookup fails. Consider applying the suggested pattern from past reviews to wrap all not-found errors consistently.🔎 Suggested consistent error wrapping
func GetLatestFinalizedHeader(db storage.DB) (*flow.Header, error) { var finalized uint64 r := db.Reader() err := operation.RetrieveFinalizedHeight(r, &finalized) if err != nil { + if errors.Is(err, storage.ErrNotFound) { + return nil, fmt.Errorf("could not retrieve finalized height: node may not be bootstrapped: %w", err) + } return nil, err }state/cluster/badger/mutator_test.go (1)
447-452: Test re-enabled and validates cluster reference block rejection.This test was previously skipped and is now enabled to verify that using a cluster block as a reference block is rejected. For consistency with similar tests in this file (e.g.,
TestExtend_WithReferenceBlockFromDifferentEpochat lines 466-469), consider adding an assertion on the error type.🔎 Optional: Add error type assertion
func (suite *MutatorSuite) TestExtend_WithReferenceBlockFromClusterChain() { // set genesis from cluster chain as reference block proposal := suite.ProposalWithParentAndPayload(suite.genesis, *model.NewEmptyPayload(suite.genesis.ID())) err := suite.state.Extend(&proposal) suite.Assert().Error(err) + suite.Assert().True(state.IsInvalidExtensionError(err)) }storage/store/headers.go (1)
80-83: Document the unsupportedByViewbehavior for cluster headers.The
retrieveViewfunction returns an error for cluster headers, but this limitation should be documented in thestorage.Headersinterface'sByViewmethod. Users of cluster headers should know that view-based lookups are not available.Per past review comments from AlexHentschel, consider adding a dedicated sentinel error (e.g.,
ErrNotAvailableForClusterConsensus) and documenting it in the interface.🔎 Suggested sentinel error
// In storage/errors.go, add: var ErrNotAvailableForClusterConsensus = errors.New("operation not available for cluster consensus")Then in
retrieveView:retrieveView := func(r storage.Reader, height uint64) (flow.Identifier, error) { var id flow.Identifier - return id, fmt.Errorf("retrieve by view not implemented for cluster headers") + return id, fmt.Errorf("ByView not supported for cluster headers: %w", storage.ErrNotAvailableForClusterConsensus) }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (82)
admin/commands/storage/read_range_cluster_blocks.go(3 hunks)cmd/collection/main.go(1 hunks)cmd/scaffold.go(2 hunks)cmd/util/cmd/common/storage.go(2 hunks)cmd/util/cmd/exec-data-json-export/block_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/delta_snapshot_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/event_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/result_exporter.go(2 hunks)cmd/util/cmd/exec-data-json-export/transaction_exporter.go(2 hunks)cmd/util/cmd/export-json-transactions/cmd.go(2 hunks)cmd/util/cmd/export-json-transactions/transactions/range_test.go(1 hunks)cmd/util/cmd/find-inconsistent-result/cmd.go(2 hunks)cmd/util/cmd/read-badger/cmd/blocks.go(2 hunks)cmd/util/cmd/read-badger/cmd/cluster_blocks.go(1 hunks)cmd/util/cmd/read-badger/cmd/collections.go(5 hunks)cmd/util/cmd/read-badger/cmd/epoch_commit.go(3 hunks)cmd/util/cmd/read-badger/cmd/epoch_protocol_state.go(3 hunks)cmd/util/cmd/read-badger/cmd/guarantees.go(3 hunks)cmd/util/cmd/read-badger/cmd/protocol_kvstore.go(3 hunks)cmd/util/cmd/read-badger/cmd/seals.go(4 hunks)cmd/util/cmd/read-badger/cmd/transaction_results.go(2 hunks)cmd/util/cmd/read-badger/cmd/transactions.go(3 hunks)cmd/util/cmd/read-light-block/read_light_block_test.go(1 hunks)cmd/util/cmd/read-protocol-state/cmd/blocks.go(2 hunks)cmd/util/cmd/read-protocol-state/cmd/snapshot.go(2 hunks)cmd/util/cmd/rollback-executed-height/cmd/rollback_executed_height.go(3 hunks)cmd/util/cmd/rollback-executed-height/cmd/rollback_executed_height_test.go(2 hunks)cmd/util/cmd/snapshot/cmd.go(2 hunks)cmd/util/cmd/verify-evm-offchain-replay/verify.go(2 hunks)consensus/integration/nodes_test.go(1 hunks)consensus/recovery/protocol/state_test.go(1 hunks)engine/access/access_test.go(4 hunks)engine/access/ingestion/collections/indexer_test.go(1 hunks)engine/access/rpc/backend/transactions/transactions_functional_test.go(1 hunks)engine/collection/compliance/engine_test.go(2 hunks)engine/collection/epochmgr/engine.go(1 
hunks)engine/collection/epochmgr/engine_test.go(3 hunks)engine/collection/epochmgr/factories/cluster_state.go(2 hunks)engine/collection/epochmgr/factories/epoch.go(3 hunks)engine/collection/epochmgr/factory.go(2 hunks)engine/collection/epochmgr/mock/epoch_components_factory.go(3 hunks)engine/collection/message_hub/message_hub_test.go(1 hunks)engine/collection/test/cluster_switchover_test.go(1 hunks)engine/common/follower/integration_test.go(1 hunks)engine/execution/pruner/core_test.go(1 hunks)engine/testutil/nodes.go(2 hunks)engine/verification/verifier/verifiers.go(1 hunks)integration/testnet/container.go(1 hunks)integration/tests/access/cohort4/access_test.go(1 hunks)integration/tests/access/cohort4/execution_data_pruning_test.go(1 hunks)model/flow/chain.go(1 hunks)module/block_iterator/iterator_test.go(2 hunks)module/builder/collection/builder.go(4 hunks)module/builder/collection/builder_test.go(29 hunks)module/executiondatasync/optimistic_sync/pipeline/pipeline_functional_test.go(1 hunks)module/finalizedreader/finalizedreader_test.go(2 hunks)module/finalizer/collection/finalizer_test.go(2 hunks)module/finalizer/consensus/finalizer_test.go(3 hunks)network/channels/errors_test.go(1 hunks)state/cluster/badger/mutator.go(9 hunks)state/cluster/badger/mutator_test.go(2 hunks)state/cluster/badger/snapshot_test.go(2 hunks)state/cluster/root_block.go(1 hunks)state/protocol/badger/mutator_test.go(4 hunks)state/protocol/badger/state.go(1 hunks)state/protocol/badger/state_test.go(3 hunks)state/protocol/util/testing.go(9 hunks)storage/badger/all.go(1 hunks)storage/errors.go(1 hunks)storage/headers.go(2 hunks)storage/operation/cluster.go(2 hunks)storage/operation/cluster_test.go(5 hunks)storage/operation/headers.go(1 hunks)storage/store/blocks.go(9 hunks)storage/store/blocks_test.go(7 hunks)storage/store/cluster_blocks_test.go(1 hunks)storage/store/guarantees_test.go(3 hunks)storage/store/headers.go(8 hunks)storage/store/headers_test.go(7 hunks)storage/store/init.go(2 
hunks)storage/store/payloads_test.go(2 hunks)utils/unittest/cluster_block.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (69)
module/finalizer/collection/finalizer_test.go (3)
utils/unittest/block.go (1)
BlockFixture(14-21)utils/unittest/locks.go (1)
WithLock(15-26)storage/locks.go (1)
LockInsertBlock(14-14)
engine/access/rpc/backend/transactions/transactions_functional_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/find-inconsistent-result/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/common/storage.go (4)
model/flow/chain.go (1)
ChainID(14-14)storage/store/init.go (2)
All(9-29)InitAll(34-76)module/metrics/noop.go (1)
NoopCollector(22-22)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/verify-evm-offchain-replay/verify.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/verification/verifier/verifiers.go (1)
cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/export-json-transactions/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
cmd/util/cmd/read-protocol-state/cmd/snapshot.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/collection/message_hub/message_hub_test.go (2)
state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
admin/commands/storage/read_range_cluster_blocks.go (4)
storage/store/cluster_payloads.go (1)
ClusterPayloads(16-19)storage/store/headers.go (1)
NewClusterHeaders(62-85)module/metrics/noop.go (1)
NoopCollector(22-22)model/flow/chain.go (1)
ChainID(14-14)
engine/collection/compliance/engine_test.go (2)
state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
engine/collection/epochmgr/engine.go (1)
model/flow/chain.go (1)
ChainID(14-14)
engine/execution/pruner/core_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-protocol-state/cmd/blocks.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
storage/operation/cluster.go (1)
storage/operation/headers.go (1)
InsertClusterHeader(57-73)
integration/tests/access/cohort4/execution_data_pruning_test.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
Localnet(35-35)
cmd/util/cmd/read-badger/cmd/transaction_results.go (3)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)storage/store/transaction_results.go (1)
NewTransactionResults(62-130)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
module/executiondatasync/optimistic_sync/pipeline/pipeline_functional_test.go (1)
model/flow/chain.go (1)
ChainID(14-14)
utils/unittest/cluster_block.go (3)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
storage/badger/all.go (3)
storage/store/init.go (2)
InitAll(34-76)All(9-29)model/flow/chain.go (1)
ChainID(14-14)storage/store/headers.go (1)
NewHeaders(33-57)
consensus/integration/nodes_test.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
ChainID(14-14)
storage/store/headers_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/store/headers.go (3)
NewHeaders(33-57)Headers(18-26)NewClusterHeaders(62-85)
cmd/util/cmd/read-badger/cmd/cluster_blocks.go (2)
storage/store/headers.go (1)
NewClusterHeaders(62-85)storage/store/cluster_payloads.go (1)
NewClusterPayloads(23-37)
storage/operation/cluster_test.go (3)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/root_block.go (1)
CanonicalClusterID(18-20)utils/unittest/fixtures.go (1)
IdentifierListFixture(1143-1149)
engine/access/access_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
engine/access/ingestion/collections/indexer_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
state/protocol/util/testing.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
module/block_iterator/iterator_test.go (2)
model/flow/chain.go (1)
ChainID(14-14)storage/store/headers.go (1)
NewHeaders(33-57)
cmd/util/cmd/snapshot/cmd.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)
engine/common/follower/integration_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-badger/cmd/protocol_kvstore.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
engine/collection/epochmgr/factory.go (1)
model/flow/chain.go (1)
ChainID(14-14)
state/protocol/badger/state_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/read-badger/cmd/guarantees.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
storage/store/blocks_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/util/cmd/export-json-transactions/transactions/range_test.go (2)
cmd/util/cmd/common/storage.go (1)
InitStorages(61-64)model/flow/chain.go (1)
ChainID(14-14)
state/protocol/badger/mutator_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
cmd/collection/main.go (1)
admin/commands/storage/read_range_cluster_blocks.go (1)
NewReadRangeClusterBlocksCommand(29-34)
cmd/util/cmd/read-badger/cmd/seals.go (1)
module/metrics/noop.go (1)
NoopCollector(22-22)
cmd/util/cmd/exec-data-json-export/result_exporter.go (3)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)module/metrics/noop.go (1)
NoopCollector(22-22)storage/store/headers.go (1)
NewHeaders(33-57)
cmd/util/cmd/read-badger/cmd/blocks.go (2)
state/protocol/badger/state.go (1)
GetChainIDFromLatestFinalizedHeader(996-1002)storage/store/headers.go (1)
NewHeaders(33-57)
state/cluster/badger/snapshot_test.go (5)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)storage/store/cluster_payloads.go (1)
NewClusterPayloads(23-37)storage/store/headers.go (2)
NewClusterHeaders(62-85)Headers(18-26)state/cluster/badger/mutator.go (1)
NewMutableState(34-44)
integration/testnet/container.go (2)
storage/store/headers.go (1)
NewHeaders(33-57)model/flow/chain.go (1)
ChainID(14-14)
module/finalizedreader/finalizedreader_test.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
engine/testutil/nodes.go (2)
storage/store/init.go (1)
InitAll(34-76)storage/badger/all.go (1)
InitAll(14-53)
storage/store/cluster_blocks_test.go (1)
model/flow/chain.go (1)
ChainID(14-14)
consensus/recovery/protocol/state_test.go (1)
storage/store/headers.go (1)
NewHeaders(33-57)
engine/collection/epochmgr/factories/cluster_state.go (4)
model/flow/chain.go (1)
ChainID(14-14)state/cluster/badger/mutator.go (2)
MutableState(23-30)NewMutableState(34-44)storage/store/headers.go (3)
admin/commands/storage/read_range_cluster_blocks.go

```diff
+	clusterHeaders := store.NewClusterHeaders(&metrics.NoopCollector{}, c.db, flow.ChainID(chainID))
 	clusterBlocks := store.NewClusterBlocks(
-		c.db, flow.ChainID(chainID), c.headers, c.payloads,
+		c.db, flow.ChainID(chainID), clusterHeaders, c.payloads,
 	)
```
Handle potential panic from invalid chain ID.
store.NewClusterHeaders panics if the provided chainID is not a canonical cluster chain ID (per the relevant code snippet from storage/store/headers.go:61-84). Since chainID comes from user input at line 37, an invalid chain ID would cause a panic instead of returning a user-friendly error.
🔎 Suggested validation before creating headers

```diff
+import "github.com/onflow/flow-go/state/cluster"
+
 func (c *ReadRangeClusterBlocksCommand) Handler(ctx context.Context, req *admin.CommandRequest) (interface{}, error) {
 	chainID, err := parseString(req, "chain-id")
 	if err != nil {
 		return nil, err
 	}
+	if !cluster.IsCanonicalClusterID(flow.ChainID(chainID)) {
+		return nil, admin.NewInvalidAdminReqErrorf("invalid cluster chain ID: %s", chainID)
+	}
 	reqData, err := parseHeightRangeRequestData(req)
```

Committable suggestion skipped: line range outside the PR's diff.
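The pattern the review recommends — validate untrusted input up front rather than letting a panicking constructor see it — can be sketched as follows. This is a self-contained illustration, not the real flow-go API: the `ChainID` type, the `"cluster-"` prefix convention, and all function names here are stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ChainID stands in for flow.ChainID; the "cluster-" prefix is a
// hypothetical canonical form, not the real flow-go encoding.
type ChainID string

// isCanonicalClusterID mimics the shape of a validation predicate like
// cluster.IsCanonicalClusterID: a pure check callers can run first.
func isCanonicalClusterID(id ChainID) bool {
	return strings.HasPrefix(string(id), "cluster-")
}

// newClusterHeaders models a constructor that treats a bad chain ID as a
// programmer error and panics, assuming its input was already validated.
func newClusterHeaders(id ChainID) string {
	if !isCanonicalClusterID(id) {
		panic(fmt.Sprintf("not a canonical cluster chain ID: %s", id))
	}
	return "headers store for " + string(id)
}

// handler validates the user-supplied chain ID first, so invalid input
// becomes a user-facing error instead of a panic inside the constructor.
func handler(userInput string) (string, error) {
	id := ChainID(userInput)
	if !isCanonicalClusterID(id) {
		return "", errors.New("invalid cluster chain ID: " + userInput)
	}
	return newClusterHeaders(id), nil
}

func main() {
	if _, err := handler("mainnet"); err != nil {
		fmt.Println("rejected:", err)
	}
	store, _ := handler("cluster-42")
	fmt.Println(store)
}
```

The point is the ordering: the admin command is the trust boundary, so the validation belongs there, and the panicking constructor is only ever reached with pre-checked input.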
A ChainID must now be provided to a Headers storage instance on creation.
That storage instance can then only store or retrieve headers belonging to that ChainID, and the height-based index is likewise scoped to that ChainID.
This fixes a bug where cluster blocks from the previous epoch's cluster chain were unnecessarily checked for duplicate transactions.
Closes: #4204
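Conceptually, the change binds each Headers instance to a single chain at construction time, so both ID-based and height-based lookups can never cross into another chain's headers. A minimal sketch of that idea, with illustrative types and names rather than the actual flow-go storage API:

```go
package main

import "fmt"

// ChainID and Header are simplified stand-ins for the flow-go types.
type ChainID string

type Header struct {
	ChainID ChainID
	Height  uint64
	ID      string
}

// Headers is a chain-scoped header store: the ChainID is fixed when the
// store is created, and every write is checked against it.
type Headers struct {
	chainID  ChainID
	byID     map[string]Header
	byHeight map[uint64]string // height index is scoped to chainID as well
}

func NewHeaders(chainID ChainID) *Headers {
	return &Headers{
		chainID:  chainID,
		byID:     make(map[string]Header),
		byHeight: make(map[uint64]string),
	}
}

// Store rejects headers from any other chain instead of silently mixing
// them into this chain's indexes.
func (h *Headers) Store(hdr Header) error {
	if hdr.ChainID != h.chainID {
		return fmt.Errorf("wrong chain: store is scoped to %s, got %s", h.chainID, hdr.ChainID)
	}
	h.byID[hdr.ID] = hdr
	h.byHeight[hdr.Height] = hdr.ID
	return nil
}

// ByHeight resolves a height only within this store's chain.
func (h *Headers) ByHeight(height uint64) (Header, error) {
	id, ok := h.byHeight[height]
	if !ok {
		return Header{}, fmt.Errorf("no header at height %d on chain %s", height, h.chainID)
	}
	return h.byID[id], nil
}

func main() {
	headers := NewHeaders("cluster-epoch-2")
	fmt.Println(headers.Store(Header{ChainID: "cluster-epoch-2", Height: 10, ID: "abc"}))
	// A header from the previous epoch's cluster chain is rejected outright.
	fmt.Println(headers.Store(Header{ChainID: "cluster-epoch-1", Height: 10, ID: "old"}))
	hdr, _ := headers.ByHeight(10)
	fmt.Println(hdr.ID)
}
```

With this scoping, a lookup by height on the current cluster's store cannot return a header from the previous epoch's chain — which is the class of cross-chain confusion the de-duplication bug came from.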