Skip to content
This repository was archived by the owner on Jul 1, 2021. It is now read-only.

Conversation

cburgdorf
Copy link
Contributor

@cburgdorf cburgdorf commented Jun 16, 2020

What was wrong?

As explained in #1788 our BaseHeaderChainSyncer currently fails to follow new blocks. It catches up to the tip and then stalls forever.

How was it fixed?

It looks like this is a case of https://github.com/ethereum/snake-charmers-tactical-manual/pull/70/files (cc @gsalgado ) and I was able to get it to work with this fix.

However, this is breaking one of our tests and I'm beginning to think the proper fix would be to break out of this yield await when the SkeletonSyncer was canceled.

async def next_skeleton_segment(self) -> AsyncIterator[Tuple[BlockHeader, ...]]:
while self.manager.is_running:
yield await self._fetched_headers.get()
self._fetched_headers.task_done()

Trying to figure out how to do that...

To-Do

  • Clean up commit history
  • no release notes entry because the bug did not exist in the past release

Cute Animal Picture

put a cute animal picture link inside the parentheses

await asyncio.wait(
[task, syncer.manager.wait_finished()],
return_when=asyncio.FIRST_COMPLETED)
await self._full_skeleton_sync(syncer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like the anti-pattern I described in ethereum/snake-charmers-tactical-manual#70. I haven't looked into how _full_skeleton_sync() is implemented, but I imagine it will crash somehow if syncer terminates

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. I haven't fully groked yet why the code broke (could be a fall out from the migration to async-service).

What happens is that this loop of starting new skeleton syncs doesn't work anymore.

async for peer in self._tip_monitor.wait_tip_info():
try:
await self._validate_peer_is_ahead(peer)
except _PeerBehind:
self.logger.debug("At or behind peer %s, skipping skeleton sync", peer)
else:
async with self._get_skeleton_syncer(peer) as syncer:
await self._full_skeleton_sync(syncer)

Whenever the SkeletonSyncer cancels itself, we continue to hang on self._full_skeleton_sync(syncer) forever. The changes in this PR fixes it but it's probably not the right fix (hence the warnings). But I have an another idea to try out :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, warnings are gone when I add task.cancel() but now it fails one of our tests 🙈 Investigating...

@cburgdorf cburgdorf force-pushed the christoph/fix/broken-tip-monitoring branch 3 times, most recently from 57b72e6 to 301f558 Compare June 16, 2020 11:53
@carver carver mentioned this pull request Jun 16, 2020
3 tasks
@cburgdorf cburgdorf force-pushed the christoph/fix/broken-tip-monitoring branch from 301f558 to 0f19bcb Compare June 17, 2020 08:36
@cburgdorf cburgdorf marked this pull request as draft June 17, 2020 08:39
async def next_skeleton_segment(self) -> AsyncIterator[Tuple[BlockHeader, ...]]:
# @external_asyncio_api can not be used on `next_skeleton_segment` directly because
# it returns a generator. Maybe that could be fixed though?
@external_asyncio_api
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can not use @external_asyncio_api directly on async def next_skeleton_segment then this feels like abusing the decorator I guess

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed. It doesn't take too much work in async-service to expose the new API though, AFAICT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cburgdorf cburgdorf closed this Jun 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants