Reboot/abort improvement suggestions (double persist, missing abort, auto reboot) #406

@metalwarrior665

Issue description

Currently, Actor.reboot triggers all listeners attached to the persistState and migrating events, waits for them to finish, and then calls the API to restart the run. This is generally the desired behavior, but there are a few non-ideal points:

  1. The developer should generally call await Actor.reboot() inside the Actor.on('migrating', ...) listener, which makes the SDK and Crawlee persist everything. But the SDK and Crawlee have already received the persistState event and are persisting asynchronously at that moment. So everything is persisted twice, and with that comes a tiny chance of a race condition where the earlier state is stored later (the difference is probably under 100 ms, so likely insignificant). The developer could work around this by adding sleep(2000) before calling reboot, but that is ugly extra code. This isn't critical to fix, just slightly odd behavior.
  2. Actor.reboot shouldn't be used for the aborting event, so we need a comparable solution for it. Generally, we should handle migrations and aborts the same way, but we (the platform and tooling teams) tend to forget about graceful aborting. We could export the forcePersist part of the functionality for developers to use, or we could add forcePersist (optionally?) to Actor.exit so developers would use that instead of reboot.
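
To make point 1 concrete, here is a toy model of the double persistence. It does not use the real SDK: ToyActor, its persistCount counter, and the rebooting guard are hypothetical stand-ins that only mimic the behavior described above (reboot re-running the persistState and migrating listeners before restarting).

```typescript
type EventName = 'persistState' | 'migrating';
type Listener = () => Promise<void>;

// Hypothetical stand-in for the SDK's Actor, only for illustration.
class ToyActor {
    private readonly listeners: Record<EventName, Listener[]> = {
        persistState: [],
        migrating: [],
    };
    private rebooting = false; // assumed guard against re-entrant reboot calls
    public persistCount = 0;
    public rebooted = false;

    on(event: EventName, listener: Listener): void {
        this.listeners[event].push(listener);
    }

    async emit(event: EventName): Promise<void> {
        await Promise.all(this.listeners[event].map((listener) => listener()));
    }

    // Mimics the described behavior: run all persistState and migrating
    // listeners, wait for them, then "restart".
    async reboot(): Promise<void> {
        if (this.rebooting) return;
        this.rebooting = true;
        const all = [...this.listeners.persistState, ...this.listeners.migrating];
        await Promise.all(all.map((listener) => listener()));
        this.rebooted = true;
    }
}

async function simulateMigration(): Promise<ToyActor> {
    const actor = new ToyActor();
    // Stands in for the SDK and Crawlee persisting their state.
    actor.on('persistState', async () => {
        actor.persistCount += 1;
    });
    // The pattern the developer is expected to write.
    actor.on('migrating', async () => {
        await actor.reboot();
    });
    // The platform signals a migration: both events reach the listeners.
    await Promise.all([actor.emit('persistState'), actor.emit('migrating')]);
    return actor;
}
```

In this model the state is persisted twice per migration: once by the original persistState event and once more by reboot.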

Long term, however, I think the SDK should handle migrations and aborts basically silently in the background. Since useState and the on('persistState') / on('migrating') listeners have been available for a long time, we can expect all state management to go through them. The SDK would then simply reboot or exit on its own after persisting. Another benefit is that it would finally make sense to make graceful abort the default on the platform, because the SDK would exit almost immediately anyway. But since we would be removing the 30-second hanging period, it might be better to do this as part of some larger (v4?) change.
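
A rough sketch of what this long-term behavior could look like, again as a toy model rather than SDK code. ToyLifecycle, registerState, handle, and the string log are all hypothetical names; forcePersist here is just a plain method standing in for the functionality mentioned in point 2, and the "reboot" / "exit" entries stand in for the real API calls.

```typescript
type PersistFn = () => Promise<void>;
type PlatformEvent = 'migrating' | 'aborting';

// Hypothetical model of an SDK that handles both events on its own.
class ToyLifecycle {
    private readonly persistFns: PersistFn[] = [];
    private finishing = false;
    public readonly log: string[] = [];

    // Stands in for state registered through useState or persistState listeners.
    registerState(persist: PersistFn): void {
        this.persistFns.push(persist);
    }

    // Persist everything exactly once and wait for it to finish.
    async forcePersist(): Promise<void> {
        await Promise.all(this.persistFns.map((persist) => persist()));
    }

    // Migrations and aborts share one path: persist, then reboot or exit.
    async handle(event: PlatformEvent): Promise<void> {
        if (this.finishing) return; // ignore further events once shutting down
        this.finishing = true;
        await this.forcePersist();
        this.log.push(event === 'migrating' ? 'reboot' : 'exit');
    }
}

async function simulateSilentHandling(): Promise<string[]> {
    const lifecycle = new ToyLifecycle();
    lifecycle.registerState(async () => {
        lifecycle.log.push('persist');
    });
    await lifecycle.handle('aborting');
    await lifecycle.handle('migrating'); // arrives too late, ignored
    return lifecycle.log;
}
```

The point of the single handle path is that the developer writes no reboot or exit call at all, and state is persisted once regardless of which event arrives.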

Package version

latest

Node.js version

latest

Labels

t-tooling (issues with this label are in the ownership of the tooling team)
