platform/qemu: detect if QEMU process exits unexpectedly #3869

jlebon · 2024-09-03T15:49:31Z

Currently, we only try to detect if the QEMU process exited by actually wait()ing for it in the kola qemuexec path. We should do it in the kola testing path as well so that it's easy to tell if e.g. it was killed while the test was running.

Note this doesn't actually stop the test early if QEMU exited. That would require some tricky wiring into the harness. But at least what it prints helps diagnose the issue when we see the test time out on SSH. And the QEMU process won't just hang there as defunct.
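For illustration, the general shape of this kind of detection can be sketched as a goroutine that waits on the child process and logs an abnormal exit. This is a minimal sketch and not the code in this PR: `watchedProcess` and `startWatched` are hypothetical names, and it prints with `fmt` where mantle would use its own logger.

```go
package main

import (
	"fmt"
	"os/exec"
)

// watchedProcess is a hypothetical wrapper (not a mantle type) that reaps
// its child in the background so the exit is noticed as soon as it happens.
type watchedProcess struct {
	cmd  *exec.Cmd
	done chan struct{} // closed once the child has been reaped
	err  error         // result of cmd.Wait(); only read after done is closed
}

// startWatched starts the command and spawns a goroutine that waits on it.
// Because Wait() is always called, the child never lingers as a defunct
// (zombie) process, and an abnormal exit is reported immediately.
func startWatched(name string, args ...string) (*watchedProcess, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	p := &watchedProcess{cmd: cmd, done: make(chan struct{})}
	go func() {
		p.err = cmd.Wait()
		if p.err != nil {
			// Assumption: a real implementation would use the project's logger.
			fmt.Printf("process finished abnormally: %v\n", p.err)
		}
		close(p.done)
	}()
	return p, nil
}

// Wait blocks until the background goroutine has reaped the child, then
// returns the recorded result. It is safe to call more than once.
func (p *watchedProcess) Wait() error {
	<-p.done
	return p.err
}
```

As in the description above, this only reports the exit. Stopping a running test early would still need the caller to select on `done`.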

dustymabe · 2024-09-03T19:28:40Z

I observed this working properly in a test pipeline run:

[2024-09-03T19:14:08.996Z] 2024-09-03T19:14:06Z platform/machine/qemu: QEMU process finished abnormally: signal: killed

dustymabe · 2024-09-03T19:29:36Z

> Note this doesn't actually stop the test early if QEMU exited. That would require some tricky wiring into the harness. But at least what it prints helps diagnose the issue when we see the test time out on SSH. And the QEMU process won't just hang there as defunct.

I wonder if we should add this as a comment in the code somewhere (i.e. explaining the more ideal future state).

dustymabe · 2024-09-03T19:30:30Z

I'm guessing we don't need this code from the other day then?

$ git diff mantle/platform/qemu.go
diff --git a/mantle/platform/qemu.go b/mantle/platform/qemu.go
index 6fe76da9f..5106b42b5 100644
--- a/mantle/platform/qemu.go
+++ b/mantle/platform/qemu.go
@@ -208,7 +208,9 @@ func (inst *QemuInstance) SSHAddress() (string, error) {
 
 // Wait for the qemu process to exit
 func (inst *QemuInstance) Wait() error {
-       return inst.qemu.Wait()
+       r := inst.qemu.Wait()
+       plog.Debugf("Waited for qemu. r=%v", r)
+       return r
 }
 
 // WaitIgnitionError will only return if the instance

jlebon · 2024-09-04T15:01:09Z

> I'm guessing we don't need this code from the other day then?

Yeah, we don't need it anymore.

dustymabe

LGTM

jlebon force-pushed the pr/qemu-exit branch from 67bfb04 to 32ac1f4 Compare September 4, 2024 15:00

dustymabe approved these changes Sep 4, 2024

dustymabe enabled auto-merge (rebase) September 4, 2024 16:15

dustymabe merged commit 30480db into coreos:main Sep 4, 2024
2 checks passed
