I have a machine with fewer than 10 microvms on it. When the machine is rebooted daily, the log looks like:
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-d'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-b'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-e'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-a'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-f'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-c'...
Dec 31 09:34:05 foo systemd[1]: Started MicroVM 'machine-a'.
Dec 31 09:34:52 foo systemd[1]: Started MicroVM 'machine-c'.
Dec 31 09:34:53 foo systemd[1]: microvm@machine-b.service: Failed with result 'timeout'.
Dec 31 09:34:53 foo systemd[1]: Failed to start MicroVM 'machine-b'.
Dec 31 09:34:53 foo systemd[1]: microvm@machine-d.service: Failed with result 'timeout'.
Dec 31 09:34:53 foo systemd[1]: Failed to start MicroVM 'machine-d'.
Dec 31 09:34:53 foo systemd[1]: microvm@machine-f.service: Failed with result 'timeout'.
Dec 31 09:34:53 foo systemd[1]: Failed to start MicroVM 'machine-f'.
Dec 31 09:34:53 foo systemd[1]: microvm@machine-e.service: Failed with result 'timeout'.
Dec 31 09:34:53 foo systemd[1]: Failed to start MicroVM 'machine-e'.
Dec 31 09:35:00 foo systemd[1]: Starting MicroVM 'machine-b'...
Dec 31 09:35:00 foo systemd[1]: Starting MicroVM 'machine-f'...
Dec 31 09:35:00 foo systemd[1]: Starting MicroVM 'machine-d'...
Dec 31 09:35:00 foo systemd[1]: Starting MicroVM 'machine-e'...
Dec 31 09:35:44 foo systemd[1]: Started MicroVM 'machine-e'.
Dec 31 09:35:44 foo systemd[1]: Started MicroVM 'machine-f'.
Dec 31 09:35:50 foo systemd[1]: Started MicroVM 'machine-b'.
Dec 31 09:34:53 foo systemd[1]: microvm@machine-d.service: Failed with result 'timeout'.
Dec 31 09:36:31 foo systemd[1]: Failed to start MicroVM 'machine-d'.
Dec 31 09:36:39 foo systemd[1]: Starting MicroVM 'machine-d'...
Dec 31 09:37:48 foo systemd[1]: Started MicroVM 'machine-d'.
My first read is that the machines need a fair amount of CPU time before systemd considers them 'up'. Any that are not up within the default start timeout of 1m30s are terminated and restarted (the first failures at 09:34:53 come 91 seconds after the 09:33:22 start). Eventually this process settles down.
What doesn't add up for me is that I think there should be enough CPU available for all of this to happen.
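To make the timings easier to compare, here is a rough sketch that pairs each "Starting" line with the following "Started" or "Failed to start" line for the same VM and reports the elapsed seconds. It is only an illustration: the `start_attempts` helper and the regular expression are my own and assume the exact message wording shown in the log above. It also assumes all lines fall on the same day, so it does not handle a boot that crosses midnight.

```python
import re
from datetime import datetime

# Matches the systemd lines from the log above. The opening quote is
# optional because some lines in the pasted log are missing it.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+systemd\[1\]:\s+"
    r"(Starting|Started|Failed to start) MicroVM '?([\w-]+)'"
)


def start_attempts(log_text):
    """Return (vm, outcome, seconds) for each completed start attempt."""
    pending = {}   # vm name -> timestamp of its latest "Starting" line
    attempts = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the "Failed with result 'timeout'" lines
        ts = datetime.strptime(m.group(1), "%H:%M:%S")
        event, vm = m.group(2), m.group(3)
        if event == "Starting":
            pending[vm] = ts
        elif vm in pending:
            elapsed = int((ts - pending.pop(vm)).total_seconds())
            outcome = "started" if event == "Started" else "failed"
            attempts.append((vm, outcome, elapsed))
    return attempts


SAMPLE = """\
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-a'...
Dec 31 09:33:22 foo systemd[1]: Starting MicroVM 'machine-b'...
Dec 31 09:34:05 foo systemd[1]: Started MicroVM 'machine-a'.
Dec 31 09:34:53 foo systemd[1]: Failed to start MicroVM 'machine-b'.
"""

for vm, outcome, seconds in start_attempts(SAMPLE):
    print(f"{vm}: {outcome} after {seconds}s")
# machine-a: started after 43s
# machine-b: failed after 91s
```

On these sample lines even the successful start takes 43 seconds, and the failure lands just past the 90-second limit.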
Some questions:
- When is a microvm considered started?
- How might I determine why they are not coming up quickly?
- Has anyone else encountered this? Are there common pitfalls?
- Should the default start timeout be raised?
- If it turns out the machines are contending for resources (such as I/O), is there a straightforward way to stagger their boot to reduce the contention overhead?