
Conversation

@EgeCaner
Contributor

@EgeCaner EgeCaner commented Jan 8, 2026

Due to a bug, only the first L1Handler transaction in a block was written to the MsgHash to L2TxnHash mapping, so the mappings for all subsequent L1Handler transactions in the block were missing, causing errors in starknet_getMessageStatus.

Includes a migration.
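
For illustration, a minimal sketch of the bug shape (hypothetical names and types, not the actual Juno code):

// Hypothetical types standing in for Juno's transaction model; names and
// shapes are assumptions for illustration only.
type txn struct {
	isL1Handler bool
	msgHash     string // L1 message hash
	txnHash     string // L2 transaction hash
}

// Buggy shape: returning from inside the loop persists only the first
// L1Handler transaction's mapping in each block.
func writeMsgHashesBuggy(mapping map[string]string, block []txn) {
	for _, t := range block {
		if t.isL1Handler {
			mapping[t.msgHash] = t.txnHash
			return // bug: every subsequent L1Handler txn in the block is dropped
		}
	}
}

// Fixed shape: record a mapping for every L1Handler transaction in the block.
func writeMsgHashes(mapping map[string]string, block []txn) {
	for _, t := range block {
		if t.isL1Handler {
			mapping[t.msgHash] = t.txnHash
		}
	}
}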

@EgeCaner EgeCaner added the disable-deploy-test label (We don't want to run deploy tests with this PR because it might affect our development environment.) on Jan 8, 2026
@EgeCaner EgeCaner force-pushed the fix/L1-handler-message-hash-to-txn-hash-mapping branch from 9a9b02a to c92ac35 on January 8, 2026, 13:35
@codecov

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 73.17073% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.32%. Comparing base (2335b43) to head (535f9d3).

Files with missing lines   Patch %   Lines
migration/migration.go     73.52%    5 Missing and 4 partials ⚠️
core/accessors.go          71.42%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3367      +/-   ##
==========================================
- Coverage   76.33%   76.32%   -0.02%     
==========================================
  Files         351      351              
  Lines       33309    33349      +40     
==========================================
+ Hits        25427    25452      +25     
- Misses       6070     6080      +10     
- Partials     1812     1817       +5     

☔ View full report in Codecov by Sentry.

@EgeCaner EgeCaner force-pushed the fix/L1-handler-message-hash-to-txn-hash-mapping branch from 2d955bb to 8c1faac on January 8, 2026, 17:30
@EgeCaner EgeCaner force-pushed the fix/L1-handler-message-hash-to-txn-hash-mapping branch from 8c1faac to 535f9d3 on January 8, 2026, 17:42
@thiagodeev thiagodeev self-requested a review January 9, 2026 17:04
Contributor

@thiagodeev thiagodeev left a comment


LGTM.

By generating a benchmark with Cursor, it seems the approach with a buffered channel is at least 15% faster. It's a one-line change, WDYT about it?

[benchmark screenshot]
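
For concreteness, the one-line change would presumably look like this (one slot per worker is an assumption):

// Before (unbuffered): every send blocks until a worker is ready to receive.
// blockNumbers := make(chan uint64)

// After (buffered): sends only block once the buffer fills, letting the
// producer run ahead of the workers.
blockNumbers := make(chan uint64, numWorkers)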

Comment on lines +1030 to +1033
var writeMu sync.Mutex
numWorkers := runtime.GOMAXPROCS(0)
workerPool := pool.New().WithErrors().WithMaxGoroutines(numWorkers)
batch := database.NewBatch()
Collaborator


How long does the migration take? If it's super quick, no need, but if not, what do you think of adding some logging to inform users? I leave it completely at your discretion.

Also, is this operation light enough to work under Juno's minimum requirements of 8 GB of RAM and 4 cores? I think yes, but asking just to be safe. An intuitive answer is good enough in this case, in my opinion; let's not put more time into this PR than strictly required.

Contributor Author


The migration was taking ~1 min 15 sec on my machine; let me run it in an environment similar to the minimum requirements.

Contributor


I think how long it takes also matters, because we're writing everything in a single batch. If the user kills the process in the middle, all progress is lost, and they have to pay the same migration cost again.

Contributor


@EgeCaner Is it Sepolia or mainnet?

Contributor Author

@EgeCaner EgeCaner Jan 12, 2026


I'm not sure we should allow cancelling. In case of a cancel and re-run, we would do all the operations again; the migration won't resume from where it was cancelled. To be able to resume from where we left off, we would need to track additional state: just by looking at the table we are writing (L1 message hash to L2 txn hash), we cannot tell where we left off.
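
If we did want resumability, a rough sketch of the extra state (hypothetical key name and store interface, not part of this PR):

import "encoding/binary"

// Persist the last fully migrated block number so a cancelled run can resume
// instead of redoing everything.
const progressKey = "migration/l1HandlerMsgHash/lastBlock"

type kvStore interface {
	Get(key string) ([]byte, bool)
	Put(key string, value []byte) error
}

// startBlock returns the first block a (re)run should process.
func startBlock(db kvStore) uint64 {
	if raw, ok := db.Get(progressKey); ok && len(raw) == 8 {
		return binary.BigEndian.Uint64(raw) + 1 // resume after the checkpoint
	}
	return 0 // fresh run
}

// checkpoint records blockNum as fully migrated.
func checkpoint(db kvStore, blockNum uint64) error {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], blockNum)
	return db.Put(progressKey, buf[:])
}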

Contributor Author


> Is it Sepolia or mainnet?

It was Sepolia.

Contributor Author


I will add some logs as well; on Mainnet it will take a lot more time than on Sepolia.

blockNumbers := make(chan uint64)
go func() {
	for blockNum := range chainHeight + 1 {
		blockNumbers <- blockNum
Contributor


I think we should select on the context here, in case the user wants to cancel.
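
Something along these lines (a sketch; assumes a ctx is already threaded into the migration):

blockNumbers := make(chan uint64)
go func() {
	defer close(blockNumbers)
	for blockNum := range chainHeight + 1 {
		select {
		case blockNumbers <- blockNum:
		case <-ctx.Done():
			return // user cancelled: stop feeding workers
		}
	}
}()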

// recalculateL1HandlerMsgHashes recalculates L1Handler message hash to txn hash mappings.
// Needed because calculateL1MsgHashes2 ran with a buggy WriteL1HandlerMsgHashes.
// Functionally same as calculateL1MsgHashes2, but optimised for concurrent reads.
type recalculateL1HandlerMsgHashesToTxnHashes struct{}
Contributor


nit: Is it possible to move this into a separate file or package? I think this file is getting too long now.

Contributor Author


Sure

@EgeCaner
Contributor Author

> LGTM.
>
> By generating a benchmark with Cursor, it seems the approach with a buffered channel is at least 15% faster. It's a one-line change, WDYT about it?
>
> [benchmark screenshot]

I tried the buffered channel, but it didn't provide any meaningful or visible difference; the time spent on the channel is dwarfed by the other heavy operations (DB reads/writes).
