memory.devices: Fix teardown timeout by removing kernel param before XML restore #6760

BulaYoungR · 2026-01-11T12:19:43Z

Reorder teardown operations to use current working VM config instead of backup config when removing kernel parameters. Add serial console fallback if SSH login fails.

Couldn't reproduce the failure locally but adding these changes proactively to prevent potential timeout issues in CI.

Summary by CodeRabbit

Tests
- Enhanced teardown procedures with more robust VM lifecycle management including destruction and restart handling
- Introduced configurable login timeout with automatic fallback to serial console when SSH authentication fails
- Improved session management to ensure consistent boot parameter updates during test cleanup

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Reorder teardown operations to use current working VM config instead of backup config when removing kernel parameters. Add serial console fallback if SSH login fails.

coderabbitai · 2026-01-11T12:19:55Z

Walkthrough

This change modifies the teardown method in a test file that handles virtio memory device size requests. The teardown logic now includes enhanced VM cleanup when kernel parameters are removed: the VM is destroyed if alive and restarted, followed by an attempted SSH login with a configurable timeout that falls back to serial console login on failure. Once a session is established, boot options are updated with the removed kernel parameter, the session closes, and final VM cleanup occurs. A login_timeout parameter (default 360 seconds) was added to control login attempts, and the sync operation was repositioned to the end of teardown.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly summarizes the main change: reordering teardown to remove kernel parameters before XML restoration to fix timeout issues.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

BulaYoungR · 2026-01-11T12:23:21Z

Before fix

2025-10-08 04:26:30,686 avocado.app human            L0129 DEBUG|  (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest:
2025-10-08 04:26:30,686 avocado.app human            L0136 DEBUG| INTERRUPTED: Test interrupted: Timeout reached (1198.87 s)
2025-10-08 04:26:30,687 avocado.job testlogs         L0138 INFO | type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: INTERRUPTED
2025-10-08 04:26:30,687 avocado.job testlogs         L0139 INFO | More information in /var/log/avocado/job-results/job-2025-10-08T04.06-1c08d65/test-results/1-type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest
2025-10-08 04:26:30,729 avocado.job job              L0659 INFO | Test results available in /var/log/avocado/job-results/job-2025-10-08T04.06-1c08d65
2025-10-08 04:26:30,729 avocado.app human            L0143 INFO |
2025-10-08 04:26:30,729 avocado.app human            L0146 INFO | RESULTS    : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0

After fix

JOB ID     : 476f4df2d90707cb90c4f6b61b395cc1da18bdbf
JOB LOG    : /var/log/avocado/job-results/job-2026-01-11T07.11-476f4df/job.log
 (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: STARTED
 (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: PASS (170.89 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2026-01-11T07.11-476f4df/results.html
JOB TIME   : 173.19 s

BulaYoungR · 2026-01-11T12:23:24Z

Before fix

2025-10-08 04:26:30,686 avocado.app human            L0129 DEBUG|  (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest:
2025-10-08 04:26:30,686 avocado.app human            L0136 DEBUG| INTERRUPTED: Test interrupted: Timeout reached (1198.87 s)
2025-10-08 04:26:30,687 avocado.job testlogs         L0138 INFO | type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: INTERRUPTED
2025-10-08 04:26:30,687 avocado.job testlogs         L0139 INFO | More information in /var/log/avocado/job-results/job-2025-10-08T04.06-1c08d65/test-results/1-type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest
2025-10-08 04:26:30,729 avocado.job job              L0659 INFO | Test results available in /var/log/avocado/job-results/job-2025-10-08T04.06-1c08d65
2025-10-08 04:26:30,729 avocado.app human            L0143 INFO |
2025-10-08 04:26:30,729 avocado.app human            L0146 INFO | RESULTS    : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0

After fix

JOB ID     : 476f4df2d90707cb90c4f6b61b395cc1da18bdbf
JOB LOG    : /var/log/avocado/job-results/job-2026-01-11T07.11-476f4df/job.log
 (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: STARTED
 (1/1) type_specific.io-github-autotest-libvirt.memory.devices.virtio_mem.change_memory.memfd.zero_requested.running_guest: PASS (170.89 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2026-01-11T07.11-476f4df/results.html
JOB TIME   : 173.19 s

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In @libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py:
- Around line 365-377: The teardown currently catches broad Exception for
vm.wait_for_login and vm.wait_for_serial_login which can hide unexpected errors;
update the try/except to catch specific login-related exceptions if available
(e.g., remote.LoginTimeoutError, aexpect.Expect* or whatever framework exposes)
for both vm.wait_for_login and vm.wait_for_serial_login, and if you must keep a
broad except, replace test.log.error/test.log.warning calls with
test.log.exception(...) to record stack traces (ensure the exception variable
like ssh_err/serial_err is passed or included). Locate the block around
vm.wait_for_login and vm.wait_for_serial_login and apply these changes
consistently so both failure paths log full traceback or narrow-catch the proper
exception types.
- Around line 357-387: The teardown currently can replace an original test
exception and skip bkxml.sync() because it raises from the nested login except
and doesn't guarantee session.close()/vm.destroy() run; modify the
kernel_params_remove block to capture any exception from the main test, then
perform login attempts (vm.wait_for_login / vm.wait_for_serial_login) without
re-raising so they don't mask the original error, and wrap update_boot_option,
session.close(), and vm.destroy() in a finally that always runs; ensure
bkxml.sync() is called after that finally (or call bkxml.sync() in the outer
finally) and reference functions/variables kernel_params_remove, vm.is_alive(),
vm.destroy(), vm.start(), vm.wait_for_login(), vm.wait_for_serial_login(),
utils_test.update_boot_option(), session.close(), and bkxml.sync() when applying
the changes.

🧹 Nitpick comments (2)

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py (2)
409-409: Harden login_timeout parsing to avoid ValueError from malformed params

int(params.get("login_timeout", 360)) will throw if login_timeout is "", "None", etc., which can break teardown unexpectedly.
Proposed tweak
-    login_timeout = int(params.get("login_timeout", 360))
+    try:
+        login_timeout = int(params.get("login_timeout", 360))
+    except (TypeError, ValueError):
+        login_timeout = 360
360-362: Double-destroy in teardown may be redundant (and add time)

You explicitly vm.destroy() before starting, and again at the end. Given the framework commonly handles VM cleanup in teardown paths already (per retrieved learnings), consider whether the extra destroy(s) are necessary here, especially since this change is trying to reduce timeout risk. Based on learnings, cleanup is typically centralized in teardown helpers.

Also applies to: 384-386

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f35e2b and 7511ac5.

📒 Files selected for processing (1)

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

📚 Learning: 2025-09-24T08:01:27.899Z

Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

🧬 Code graph analysis (1)

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py (1)

libvirt/tests/src/serial/serial_functional.py (1)

wait_for_login (235-258)

🪛 Ruff (0.14.10)

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

369-369: Do not catch blind exception: Exception

(BLE001)

375-376: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Python 3.11
GitHub Check: Python 3.12
GitHub Check: Python 3.8
GitHub Check: Python 3.9

coderabbitai · 2026-01-11T12:24:20Z

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

        if kernel_params_remove:
-            if not vm.is_alive():
-                vm.start()
+            test.log.info("TEST_TEARDOWN: Removing kernel parameter '%s'",
+                          kernel_params_remove)
+            if vm.is_alive():
+                vm.destroy()
+            vm.start()
+
+            session = None
+            try:
+                test.log.debug("TEST_TEARDOWN: Trying SSH login (timeout=%ds)",
+                               login_timeout)
+                session = vm.wait_for_login(timeout=login_timeout)
+            except Exception as ssh_err:
+                test.log.warning("TEST_TEARDOWN: SSH login failed: %s", ssh_err)
+                test.log.debug("TEST_TEARDOWN: Trying serial console login")
+                try:
+                    session = vm.wait_for_serial_login(timeout=login_timeout)
+                except Exception as serial_err:
+                    test.log.error("TEST_TEARDOWN: Serial login also failed: %s",
+                                   serial_err)
+                    raise
+
            utils_test.update_boot_option(
-                vm, args_removed=kernel_params_remove, guest_arch_name=vm_arch_name)
+                vm, args_removed=kernel_params_remove,
+                guest_arch_name=vm_arch_name)
+            session.close()
+
+            if vm.is_alive():
+                vm.destroy()
+
+        bkxml.sync()


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find wait_for_serial_login and update_boot_option definitions echo "=== Searching for wait_for_serial_login ===" rg -n "def wait_for_serial_login" --type py echo "=== Searching for update_boot_option definition ===" rg -n "def update_boot_option" --type py echo "=== Searching for teardown_test and test structure ===" rg -n "def teardown_test|test_obj.teardown" --type py | head -20

Repository: autotest/tp-libvirt

Length of output: 2178

🏁 Script executed:

#!/bin/bash # Read the file to understand imports and structure echo "=== Reading the reviewed file ===" wc -l libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py head -80 libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py echo -e "\n=== Searching for utils_test import and usage ===" grep -n "import.*utils_test\|from.*utils_test\|utils_test\." libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py | head -10 echo -e "\n=== Searching for update_boot_option in utils modules ===" find . -path ./libvirt/tests -name "utils_test.py" 2>/dev/null | head -5 rg -l "def update_boot_option" --type py | head -10

Repository: autotest/tp-libvirt

Length of output: 2867

🏁 Script executed:

#!/bin/bash echo "=== Reading teardown_test context (lines 350-420) ===" sed -n '350,420p' libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py echo -e "\n=== Searching for wait_for_serial_login ===" rg "wait_for_serial_login" --type py -A 2 | head -30 echo -e "\n=== Looking for teardown_test() call site (finally block) ===" sed -n '390,424p' libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

Repository: autotest/tp-libvirt

Length of output: 6266

🏁 Script executed:

#!/bin/bash echo "=== Checking if there's any test infrastructure teardown beyond local teardown_test ===" grep -n "class\|def run\|finally:" libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py | tail -20 echo -e "\n=== Full view of run_test function wrapper ===" sed -n '240,424p' libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py | grep -E "^def run_test|^ try:|^ finally:|^ teardown_test"

Repository: autotest/tp-libvirt

Length of output: 369

Teardown can suppress the original test failure and skip bkxml.sync() on login failure

If SSH and serial login both fail, the raise inside the nested try/except will propagate from the finally block, replacing any original test exception and preventing bkxml.sync() from running—leaving the VM XML unrestored. Additionally, session.close() and the final vm.destroy() are not protected by a finally block, so an exception in update_boot_option() can leak the session and leave the VM running.

Proposed fix (ensure restore and cleanup always happen, avoid masking original failures)

def teardown_test(): """ Clean data. """ test.log.info("TEST_TEARDOWN: Clean up env.") utils_memory.set_num_huge_pages(0) - if kernel_params_remove: - test.log.info("TEST_TEARDOWN: Removing kernel parameter '%s'", - kernel_params_remove) - if vm.is_alive(): - vm.destroy() - vm.start() - - session = None - try: - test.log.debug("TEST_TEARDOWN: Trying SSH login (timeout=%ds)", - login_timeout) - session = vm.wait_for_login(timeout=login_timeout) - except Exception as ssh_err: - test.log.warning("TEST_TEARDOWN: SSH login failed: %s", ssh_err) - test.log.debug("TEST_TEARDOWN: Trying serial console login") - try: - session = vm.wait_for_serial_login(timeout=login_timeout) - except Exception as serial_err: - test.log.error("TEST_TEARDOWN: Serial login also failed: %s", - serial_err) - raise - - utils_test.update_boot_option( - vm, args_removed=kernel_params_remove, - guest_arch_name=vm_arch_name) - session.close() - - if vm.is_alive(): - vm.destroy() - - bkxml.sync() + session = None + try: + if kernel_params_remove: + test.log.info("TEST_TEARDOWN: Removing kernel parameter '%s'", + kernel_params_remove) + if vm.is_alive(): + vm.destroy() + vm.start() + + try: + test.log.debug("TEST_TEARDOWN: Trying SSH login (timeout=%ds)", + login_timeout) + session = vm.wait_for_login(timeout=login_timeout) + except Exception: + test.log.exception("TEST_TEARDOWN: SSH login failed") + test.log.debug("TEST_TEARDOWN: Trying serial console login") + try: + session = vm.wait_for_serial_login(timeout=login_timeout) + except Exception: + # Don't raise from teardown; restore XML and continue cleanup. + test.log.exception("TEST_TEARDOWN: Serial login also failed") + return + + utils_test.update_boot_option( + vm, args_removed=kernel_params_remove, + guest_arch_name=vm_arch_name) + finally: + try: + if session: + session.close() + finally: + if vm.is_alive(): + vm.destroy() + bkxml.sync()

🧰 Tools

🪛 Ruff (0.14.10)

369-369: Do not catch blind exception: Exception

(BLE001)

375-376: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents

In @libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py around lines 357 - 387, The teardown currently can replace an original test exception and skip bkxml.sync() because it raises from the nested login except and doesn't guarantee session.close()/vm.destroy() run; modify the kernel_params_remove block to capture any exception from the main test, then perform login attempts (vm.wait_for_login / vm.wait_for_serial_login) without re-raising so they don't mask the original error, and wrap update_boot_option, session.close(), and vm.destroy() in a finally that always runs; ensure bkxml.sync() is called after that finally (or call bkxml.sync() in the outer finally) and reference functions/variables kernel_params_remove, vm.is_alive(), vm.destroy(), vm.start(), vm.wait_for_login(), vm.wait_for_serial_login(), utils_test.update_boot_option(), session.close(), and bkxml.sync() when applying the changes.

coderabbitai · 2026-01-11T12:24:20Z

libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py

+            try:
+                test.log.debug("TEST_TEARDOWN: Trying SSH login (timeout=%ds)",
+                               login_timeout)
+                session = vm.wait_for_login(timeout=login_timeout)
+            except Exception as ssh_err:
+                test.log.warning("TEST_TEARDOWN: SSH login failed: %s", ssh_err)
+                test.log.debug("TEST_TEARDOWN: Trying serial console login")
+                try:
+                    session = vm.wait_for_serial_login(timeout=login_timeout)
+                except Exception as serial_err:
+                    test.log.error("TEST_TEARDOWN: Serial login also failed: %s",
+                                   serial_err)
+                    raise


⚠️ Potential issue | 🟡 Minor

Avoid “blind” except Exception (or at least log stack traces consistently)

Ruff’s BLE001/TRY400 are pointing at a real maintainability issue here: catching Exception twice makes it easy to hide unexpected teardown bugs. If the framework has specific login exceptions (e.g., remote.LoginTimeoutError, aexpect.Expect*), prefer catching those; otherwise keep the broad catch but use test.log.exception(...) so you don’t lose stack traces. (The diff above already adjusts logging.)

🧰 Tools

🪛 Ruff (0.14.10)

369-369: Do not catch blind exception: Exception

(BLE001)

375-376: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents

In @libvirt/tests/src/memory/memory_devices/change_virtio_mem_request_size.py around lines 365 - 377, The teardown currently catches broad Exception for vm.wait_for_login and vm.wait_for_serial_login which can hide unexpected errors; update the try/except to catch specific login-related exceptions if available (e.g., remote.LoginTimeoutError, aexpect.Expect* or whatever framework exposes) for both vm.wait_for_login and vm.wait_for_serial_login, and if you must keep a broad except, replace test.log.error/test.log.warning calls with test.log.exception(...) to record stack traces (ensure the exception variable like ssh_err/serial_err is passed or included). Locate the block around vm.wait_for_login and vm.wait_for_serial_login and apply these changes consistently so both failure paths log full traceback or narrow-catch the proper exception types.

Fix teardown timeout by removing kernel param before XML restore

7511ac5

Reorder teardown operations to use current working VM config instead of backup config when removing kernel parameters. Add serial console fallback if SSH login fails.

BulaYoungR changed the title ~~Fix teardown timeout by removing kernel param before XML restore~~ memory.devices: Fix teardown timeout by removing kernel param before XML restore Jan 11, 2026

coderabbitai bot reviewed Jan 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

memory.devices: Fix teardown timeout by removing kernel param before XML restore #6760

memory.devices: Fix teardown timeout by removing kernel param before XML restore #6760

Uh oh!

BulaYoungR commented Jan 11, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 11, 2026 •

edited

Loading

Uh oh!

BulaYoungR commented Jan 11, 2026

Uh oh!

BulaYoungR commented Jan 11, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 11, 2026

Uh oh!

coderabbitai bot Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

memory.devices: Fix teardown timeout by removing kernel param before XML restore #6760

Are you sure you want to change the base?

memory.devices: Fix teardown timeout by removing kernel param before XML restore #6760

Uh oh!

Conversation

BulaYoungR commented Jan 11, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Uh oh!

BulaYoungR commented Jan 11, 2026

Uh oh!

BulaYoungR commented Jan 11, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

BulaYoungR commented Jan 11, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 11, 2026 •

edited

Loading