Skip to content

Timeout sync loop #264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jun 16, 2025
Merged

Timeout sync loop #264

merged 11 commits into from
Jun 16, 2025

Conversation

n8maninger
Copy link
Member

Adds a maximum amount of time we will sync with a single peer before trying another one. This is a hot fix for an issue seen on Mainnet where a single peer dominates the single threaded sync loop preventing peers from staying synced.

Copilot

This comment was marked as outdated.

Copy link
Member

@peterjan peterjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Big change ... I remember Luke very purposefully restricting peersForSync to a handful of peers. I love it though, we should go for the robust option more often in these type of spots.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a hotfix to prevent a single peer from dominating the sync loop by adding a timeout on peer sync operations and refactoring peer iteration and logging. Key changes include refactoring the peer iteration to use new “maps” and “slices” APIs, updating the sync loop logic with a timeout for V2 block syncing, and revising logging and asynchronous relaying in the RPC handler.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
syncer/syncer.go Refactored peer iteration and added a timeout to limit sync duration with a single peer.
syncer/peer.go Updated logging via a local logger variable and changed relay call to asynchronous execution.
syncer/bootstrap.go Updated the list of bootstrap peers.
.changeset/fixed_an_issue_with_a_single_syncing_peer_from_taking_over_the_sync_loop.md Added a changeset note summarizing the hotfix.
Comments suppressed due to low confidence (2)

syncer/syncer.go:671

  • The new loop in the syncLoop replacing the sleep function is hard to follow and might not accurately preserve the intended sync interval behavior. Consider restructuring the loop to clearly separate the waiting mechanism from the execution of sync operations.
for { select { case <-ctx.Done(): return nil // note: context cancelled is not an error case <-ticker.C: }

syncer/peer.go:466

  • Changing the relayV2BlockOutline call to run asynchronously could introduce race conditions if the function accesses shared state. Verify that all shared resources handled within relayV2BlockOutline are properly synchronized.
go s.relayV2BlockOutline(r.Block, origin) // non-blocking

@n8maninger n8maninger merged commit e162bc8 into master Jun 16, 2025
11 checks passed
@n8maninger n8maninger deleted the nate/sync-loop-timeout branch June 16, 2025 16:13
@github-project-automation github-project-automation bot moved this from In Progress to Done in Sia Jun 16, 2025
@peterjan peterjan mentioned this pull request Jun 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants