Add workaround for 90 second LX systemd stop/reboot delay #1022
On SmartOS LX, systemd uses its "legacy cgroup hierarchy" (cgroup-v1) codepath, which does not signal when the root cgroup becomes empty, so systemd waits 90 seconds before continuing to shut down. This commit modifies vmadm to send the `systemd-shutdown` process SIGCHLD every second during a `vmadm stop` or `vmadm reboot` command. Each signal causes `systemd-shutdown` to re-check its list of remaining processes, which removes the unnecessary wait.
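The periodic signal described above could look roughly like the following Node.js sketch. It is not the code from this PR: `nudgeOnce` and `startNudging` are hypothetical names, the one-second default comes from the description, and the injectable `kill` parameter exists only so the logic can be exercised without signalling a real process.

```javascript
// Hypothetical helpers for illustration; not vmadm's actual implementation.

// Send a single SIGCHLD to `pid`. Returns false if the signal could not be
// delivered (for example ESRCH, because the process has already exited).
function nudgeOnce(pid, kill) {
    try {
        kill(pid, 'SIGCHLD');
        return true;
    } catch (err) {
        return false;
    }
}

// Send SIGCHLD to `pid` every `intervalMs` milliseconds (default 1000) until
// delivery fails or the returned stop function is called.
function startNudging(pid, intervalMs, kill) {
    kill = kill || process.kill.bind(process);
    var timer = setInterval(function () {
        if (!nudgeOnce(pid, kill)) {
            clearInterval(timer);
        }
    }, intervalMs || 1000);
    return function stopNudging() {
        clearInterval(timer);
    };
}
```

A caller would start the timer when the stop or reboot begins, for example `var stop = startNudging(vmobj.pid);`, and call `stop()` once the zone has halted. Whether `vmobj.pid` really refers to `systemd-shutdown` at that moment is an assumption of this sketch.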
I haven't studied vmadm(1M) much, but how are you sure that vmobj.pid is the pid for systemd-shutdown? What if you're sending SIGCHLD to something else? What about older or alternate LX images that don't have modern systemd?
Thanks for the feedback, @danmcd.
If I understand correctly, During shutdown,
Ah, good point. To reduce that risk, I could have it check the process's
Sorry for my meandering into procfs.
You should test this on multiple kinds of LX Zone images. I'm happy to build a test platform image for you that has this fix in it, OR you can use lofs tricks to mount yours over the one in use on the system.
Also, and maybe @bahamat might know, there may be a corresponding code segment in VMAPI that needs to have this as well. (I can't remember if it's duplicate code or if VMAPI directly uses files here.)
@bahamat and I had a chat offline... we're going to look a little deeper into this problem. Your analysis of how systemd-shutdown behaves is EXCELLENT, but we're wondering if this fix belongs somewhere lower in the stack, so
@danmcd, heh, no problem. (I had first looked at I've been testing on PI 20220324T002253Z using a lofs bind mount to override
Sounds good; keep me posted.
I'm not sure if this requires fixes in both VM.js and in zone brand scripts. This one may get delayed for a bit; I'm sorry.
I think if zoneadm handles this properly, vmadm should be fine as-is.
`systemd-shutdown`'s `main()` calls `broadcast_signal()` first with SIGTERM, then with SIGKILL.
[...] `sigtimedwait()` and proceeding with the reboot without a delay.
[...] `sigtimedwait()` continues until its 90 second timeout.
This PR modifies vmadm to send the `systemd-shutdown` process SIGCHLD periodically during a `vmadm stop` or `vmadm reboot` command, which causes `systemd-shutdown` to re-check its list of remaining processes, thereby removing the unnecessary delay.
Prior to this change, `vmadm stop` and `vmadm reboot` were taking slightly more than 90 seconds. With this change, each takes about 10 seconds.
See discussion and @danmcd's initial testing on https://smartos.topicbox.com/groups/smartos-discuss/T7eaf4fad91e64ad8-M2e374d3cdfd9366191bc09b7.
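To see why a periodic SIGCHLD shortens the wait, here is a toy model of the behavior described above. It is a simplification for illustration, not systemd's source: `simulateWait` and its parameters are made-up names, time is in abstract seconds, and the only rule modeled is that the waiter re-checks its process list when a SIGCHLD arrives and otherwise blocks until the timeout.

```javascript
// Toy model only; names and structure are illustrative, not systemd code.
//   exitTimes    - times (seconds) at which each remaining process exits
//   sigchldTimes - times (seconds) at which a SIGCHLD reaches the waiter
//   timeout      - maximum time the waiter blocks (90 in this discussion)
// Returns the time at which the waiter stops waiting.
function simulateWait(exitTimes, sigchldTimes, timeout) {
    var now = 0;
    var wakeups = sigchldTimes.slice().sort(function (a, b) {
        return a - b;
    });
    for (;;) {
        // Re-check the list of remaining processes.
        var alive = exitTimes.filter(function (t) {
            return t > now;
        });
        if (alive.length === 0) {
            return now; // nothing left, shutdown can continue
        }
        // Block until the next SIGCHLD or until the timeout expires.
        var next = wakeups.find(function (t) {
            return t > now;
        });
        if (next === undefined || next >= timeout) {
            return timeout; // no signal arrived in time
        }
        now = next;
    }
}
```

In this model, a process that exits at 2 seconds with no SIGCHLD delivered keeps the waiter blocked for the full 90 seconds, while a SIGCHLD every second lets it notice the exit at the 2 second mark.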