Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(workers): Don't panic when a worker's parent thread stops running #12156

Merged
merged 2 commits into from
Sep 22, 2021

Conversation

andreubotella
Copy link
Contributor

@andreubotella andreubotella commented Sep 20, 2021

This panic could happen in the following cases:

  • A non-fatal error being thrown from a worker, that doesn't terminate the worker's execution, but propagates to the main thread without being handled, and makes the main thread terminate.
  • A nested worker being alive while its parent worker gets terminated.
  • A race condition if the main event loop terminates the worker as part of its last task, but the worker doesn't fully terminate before the main event loop stops running.

This panic happens because a worker's event loop should have pending ops as long as the worker isn't closed or terminated – but if an event loop finishes running while it has living workers, its associated WorkerThread structs will be dropped, closing the channels that keep those ops pending.

This change adds a Drop implementation to WorkerThread, which terminates the worker without waiting for a response. This fixes the panic, and makes it so nested workers are automatically terminated once any of their ancestors is closed or terminated.

This change also refactors a worker's termination code into a WorkerThread::terminate() method.

Closes #11342.

@andreubotella
Copy link
Contributor Author

andreubotella commented Sep 21, 2021

As it turns out, terminating the worker when WebWorkerHandle gets dropped doesn't work, because WebWorkerHandle can be cloned without creating a new worker (which is used by the worker_host ops in order to keep access to the handle across await points). But other than those ops, the only thing that uses a WebWorkerHandle is WorkerThread, which does seem to map one-to-one with a worker. Making that struct, rather than WebWorkerHandle, terminate the worker on drop seems like the right choice.

(I'll be updating the title of this PR and the commit message once CI passes.)

This panic could happen in the following cases:

- A non-fatal error being thrown from a worker, that doesn't terminate
  the worker's execution, but propagates to the main thread without
  being handled, and makes the main thread terminate.
- A nested worker being alive while its parent worker gets terminated.
- A race condition if the main event loop terminates the worker as part
  of its last task, but the worker doesn't fully terminate before the
  main event loop stops running.

This panic happens because a worker's event loop should have pending ops
as long as the worker isn't closed or terminated – but if an event loop
finishes running while it has living workers, its associated
`WorkerThread` structs will be dropped, closing the channels that keep
those ops pending.

This change adds a `Drop` implementation to `WorkerThread`, which
terminates the worker without waiting for a response. This fixes the
panic, and makes it so nested workers are automatically terminated once
any of their ancestors is closed or terminated.

This change also refactors a worker's termination code into a
`WorkerThread::terminate()` method.

Closes denoland#11342.
@andreubotella andreubotella force-pushed the worker-drop-handle-race branch from 6cd79db to e20dd13 Compare September 21, 2021 13:46
@andreubotella andreubotella changed the title fix(workers): Dropping a WebWorkerHandle should terminate the worker fix(workers): Don't panic when a worker's parent thread stops running Sep 21, 2021
Copy link
Member

@bartlomieju bartlomieju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's try this

Comment on lines +87 to +88
.expect("Worker thread panicked")
.expect("Panic in worker event loop");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should be expect()ing here. This would cause another panic if something goes wrong in worker and might be prone to other race conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although that doesn't change current behavior so I guess it's fine for now

@bartlomieju bartlomieju merged commit 5c5f4ea into denoland:main Sep 22, 2021
@andreubotella andreubotella deleted the worker-drop-handle-race branch September 22, 2021 16:02
andreubotella pushed a commit to andreubotella/deno that referenced this pull request Sep 24, 2021
Before denoland#12156, closing a worker which had children would cause a panic
(denoland#11342 (comment)).
After that PR, closing a worker will also close any child workers.
piscisaureus pushed a commit that referenced this pull request Sep 25, 2021
…12215)

Before #12156, closing a worker which had children would cause a panic
(#11342 (comment)).

After that PR, closing a worker will also close any child workers.
caspervonb pushed a commit to caspervonb/deno that referenced this pull request Sep 26, 2021
…denoland#12156)

This panic could happen in the following cases:

- A non-fatal error being thrown from a worker, that doesn't terminate
  the worker's execution, but propagates to the main thread without
  being handled, and makes the main thread terminate.
- A nested worker being alive while its parent worker gets terminated.
- A race condition if the main event loop terminates the worker as part
  of its last task, but the worker doesn't fully terminate before the
  main event loop stops running.

This panic happens because a worker's event loop should have pending ops
as long as the worker isn't closed or terminated – but if an event loop
finishes running while it has living workers, its associated
`WorkerThread` structs will be dropped, closing the channels that keep
those ops pending.

This change adds a `Drop` implementation to `WorkerThread`, which
terminates the worker without waiting for a response. This fixes the
panic, and makes it so nested workers are automatically terminated once
any of their ancestors is closed or terminated.

This change also refactors a worker's termination code into a
`WorkerThread::terminate()` method.

Closes denoland#11342.

Co-authored-by: Bartek Iwańczuk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exceptions thrown after a worker's initial execution panic
2 participants