-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(server): Terminate on service panic #4249
base: master
Are you sure you want to change the base?
Conversation
if let Err(e) = res { | ||
if e.is_panic() { | ||
// Re-trigger panic to terminate the process: | ||
std::panic::resume_unwind(e.into_panic()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed that this terminates the process by inserting an explicit panic!()
into one of the service bodies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mid-/Long-Term it would be nice if a service can specify it's desired behaviour, abort on panic/restart on panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my understanding is correct, this essentially separates the concern of how a service is spawned out of the trait by replacing the spawn_handler
method, which contained a tokio::spawn
call, with a run
method which is just an async function. Then you can start the service using either a ServiceRunner
which keeps tabs on the services under its control—specifically, whether they panic—or just plain tokio::spawn
(in the form of the start_detached
method).
My one gripe with this is one of uniformity. In many places, services are started with runner.start(service)
, but e.g. ProjectCacheService::start
takes the runner as a parameter. It might be nicer to do that everywhere, i.e. have a start
method on Service
which takes a runner as a parameter, but it's ultimately a sort of aesthetic concern.
relay-system/src/service.rs
Outdated
pub fn start<S: Service>(&mut self, service: S) -> Addr<S::Interface> { | ||
let (addr, rx) = channel(S::name()); | ||
self.spawn(service, rx); | ||
addr | ||
} | ||
|
||
/// Starts a service and starts tracking its join handle, given a predefined receiver. | ||
pub fn spawn<S: Service>(&mut self, service: S, rx: Receiver<S::Interface>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit leery of the naming, start
vs spawn
isn't a very clear distinction. Can't think of anything better off the top of my head either, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about start
and start_with
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO something like start_with_rx
or start_with_receiver
would be even better.
Co-authored-by: Sebastian Zivota <[email protected]>
Yes!
@loewenheim this bugs me too, I will unify it to * |
After trying this out, I figured it's actually better to keep the interface of classDiagram
class ServiceRunner {
+start(service)
}
ServiceRunner --> "starts" Service
<<Interface>> Service
class Service {
async +run()
}
class ServiceWithStartHelper {
+start_in(runner)
}
Service "implements" <|-- ServiceWithStartHelper
ServiceWithStartHelper --> "calls" ServiceRunner
|
This reverts commit d0b4af9.
/// | ||
/// Exposes information about crashed services. | ||
#[derive(Debug, Default)] | ||
pub struct ServiceRunner(FuturesUnordered<JoinHandle<()>>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tokio has a JoinSet
for awaiting/spawning tasks, can use that instead of FuturesUnordered
.
if let Err(e) = res { | ||
if e.is_panic() { | ||
// Re-trigger panic to terminate the process: | ||
std::panic::resume_unwind(e.into_panic()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mid-/Long-Term it would be nice if a service can specify it's desired behaviour, abort on panic/restart on panic.
Terminate the process when a panic occurs in one of the services.
Instead of implementing a sync
fn spawn_handler()
interface, each service now implements anasync fn run()
interface. This simplifies service implementations in most cases (except where services need to spawn multiple tasks), and allows us to join on all service main tasks to detect a panic.The PR introduces a
ServiceRunner
utility to easily start & join on running services.See also #4026 for an earlier attempt.
Note: The issue requires reporting "unhealthy" instead of terminating. I will take care of that in a follow-up PR.
ref: #4037