netkvm: Make NIC hotplug memory leak test configurable #4426

heywji · 2025-12-17T05:26:27Z

Refactor the test to use a percentage-based memory leak threshold instead of a fixed value, making it more reliable for VMs with different memory sizes. The wait times for hotplug and unplug events are also now configurable parameters.

ID: 1480
Signed-off-by: Wenkang Ji [email protected]

Summary by CodeRabbit

Release Notes

New Features
- Added configurable sleep timing parameters for network interface hotplug and hotunplug operations.
- Implemented per-operating system memory leak detection thresholds using percentage-based evaluation metrics.
- Enhanced memory leak reporting to include both absolute and relative measurements for improved analysis.



coderabbitai · 2025-12-17T05:26:42Z

Walkthrough

The changes configure memory leak detection thresholds and timing parameters for NIC hotplug testing. The configuration file introduces hotplug and hotunplug sleep durations set to 3 seconds each, along with per-OS memory leak thresholds via a variants block (1.0% for Windows, 0.2% for Linux). The test script replaces fixed 3-second sleep durations with configurable parameters and implements percentage-based memory leak evaluation, computing the ratio of memory reduced to free memory before hotplug and comparing against configurable thresholds instead of hard-coded logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

- Percentage-based calculation logic: Verify that the mem_reduced_percent formula (mem_reduced / free_mem_before_nichotplug) * 100 correctly represents the intended metric and handles edge cases (e.g., zero free memory).
- Parameter parsing and defaults: Ensure hotplug_sleep and hotunplug_sleep parameters are correctly parsed from configuration and have appropriate fallback values if missing.
- Per-OS threshold application: Confirm that the variants in the configuration file correctly apply OS-specific thresholds (1.0% vs. 0.2%) and that the Python code reads these values appropriately.
- Error messaging accuracy: Review updated error messages and error_context outputs to ensure they accurately reflect both absolute memory values (in KB) and percentage metrics with threshold comparisons.
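To make the first review point concrete, here is a minimal sketch of the percentage metric as described in the walkthrough. The function name and the memory readings are hypothetical and chosen only for illustration; they are not taken from the patch.

```python
def mem_reduced_percent(free_before_kb, free_after_kb):
    """Return the drop in free memory as a percentage of the pre-hotplug value."""
    mem_reduced = free_before_kb - free_after_kb
    return (mem_reduced / free_before_kb) * 100


# Hypothetical readings: the same 20 MiB drop on guests with 2 GiB and
# 16 GiB of free memory before hotplug (all values in KB).
drop_kb = 20 * 1024
small_before = 2 * 1024 * 1024
large_before = 16 * 1024 * 1024

small_guest = mem_reduced_percent(small_before, small_before - drop_kb)
large_guest = mem_reduced_percent(large_before, large_before - drop_kb)

print("small guest: %.3f%%" % small_guest)
print("large guest: %.3f%%" % large_guest)
```

With these made-up numbers the same absolute drop is roughly 0.98% on the smaller guest and roughly 0.12% on the larger one, so a single percentage threshold treats the two guests very differently. That is worth keeping in mind when judging whether 1.0% and 0.2% are the right limits.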

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: making the NIC hotplug memory leak test configurable through percentage-based thresholds and configurable wait times.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


heywji · 2025-12-17T05:28:57Z

@yanglei-rh Hi Lei, could you please help review this patch? In particular, please check whether the "mem_leak_threshold_percent" value makes sense for the Linux side.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

qemu/tests/memory_leak_after_nichotplug.py (1)
66-66: Inconsistent with PR objectives: Linux path still uses hardcoded sleep.

The PR description states "Makes wait times for hotplug and unplug events configurable parameters," but the Linux code path (line 66) still uses a hardcoded time.sleep(3). This is inconsistent with the Windows implementation (lines 43-44, 54, 56) and the configuration parameters defined in the cfg file.

Apply this diff to use the configurable parameter:
+    hotplug_sleep = int(params.get("hotplug_sleep", 3))
+    hotunplug_sleep = int(params.get("hotunplug_sleep", 3))
     session.cmd_output_safe("swapoff -a")
     mac = vm.get_mac_address()
     guest_nic = utils_net.get_linux_ifname(session, mac)
     for i in range(1, 300):
         session.cmd_output_safe(
             "ip link add link %s name %s.%s type vlan id %s"
             % (guest_nic, guest_nic, i, i)
         )
-    time.sleep(3)
+    time.sleep(hotplug_sleep)
     for i in range(1, 300):
         session.cmd_output_safe("ip link delete %s.%s" % (guest_nic, i))
+        time.sleep(hotunplug_sleep)

🧹 Nitpick comments (1)

qemu/tests/cfg/memory_leak_after_nichotplug.cfg (1)
6-8: Clarify comment to cover both parameters.

The comment mentions "device initialization" but hotunplug_sleep also needs clarification. Consider specifying that both parameters are in seconds and apply to different operations.

Apply this diff to improve clarity:
-    # Wait time for device initialization (seconds)
+    # Wait time in seconds: hotplug_sleep for device initialization, hotunplug_sleep for device removal
     hotplug_sleep = 3
     hotunplug_sleep = 3

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89b4300 and edaa6e5.

📒 Files selected for processing (2)

qemu/tests/cfg/memory_leak_after_nichotplug.cfg (1 hunks)
qemu/tests/memory_leak_after_nichotplug.py (3 hunks)

🔇 Additional comments (5)

qemu/tests/cfg/memory_leak_after_nichotplug.cfg (1)

9-14: LGTM!

The OS-specific thresholds are well-chosen, with Windows having a higher tolerance (1.0%) than Linux (0.2%), which aligns with typical OS memory management differences.

qemu/tests/memory_leak_after_nichotplug.py (4)

43-44: LGTM!

The configurable sleep parameters are correctly parsed with sensible defaults for the Windows hotplug flow.

54-56: LGTM!

The configurable sleep times are correctly applied in the Windows hotplug/unplug loop.

77-82: LGTM!

The percentage-based comparison logic and updated error message are clear and informative, providing both absolute and relative metrics for debugging.

83-86: LGTM!

The context message correctly displays both absolute and percentage values for the memory reduction.

coderabbitai · 2025-12-17T05:29:33Z

qemu/tests/memory_leak_after_nichotplug.py

+    mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100
+    threshold_percent = float(params.get("mem_leak_threshold_percent"))


⚠️ Potential issue | 🔴 Critical

Critical: Division by zero and missing default value.

Two critical issues:

1. Division by zero: Line 74 will raise ZeroDivisionError if free_mem_before_nichotplug is 0.

2. Missing default: Line 75 calls float() on params.get("mem_leak_threshold_percent") without a default value. If the parameter is missing, params.get() returns None and float(None) raises TypeError.

Apply this diff to add safeguards:

     mem_reduced = free_mem_before_nichotplug - free_mem_after_nichotplug
-    mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100
-    threshold_percent = float(params.get("mem_leak_threshold_percent"))
+
+    if free_mem_before_nichotplug == 0:
+        test.error("Free memory before hotplug is 0, cannot calculate percentage")
+
+    mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100
+    threshold_percent = float(params.get("mem_leak_threshold_percent", "1.0"))

🤖 Prompt for AI Agents

In qemu/tests/memory_leak_after_nichotplug.py around lines 74-75, protect against division-by-zero and missing parameter: before computing mem_reduced_percent, check if free_mem_before_nichotplug is zero and set mem_reduced_percent to 0 (or skip/fail the check) to avoid ZeroDivisionError; when reading the threshold, use params.get("mem_leak_threshold_percent", "0") (or another sensible default string) and then cast with float(...) so float() never receives None—optionally validate the parsed float and raise a clear error if it's not numeric.

Got it, will include it together when Lei also reviews the Linux value part.

@heywji, sounds good! I'll wait for the updated changes after Lei's review. Feel free to tag me if you have any questions when implementing the fixes.
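For reference, the two safeguards discussed in this thread could look roughly like the standalone sketch below. It makes several assumptions: a plain dict stands in for the test's params object, ValueError stands in for test.error, the "1.0" fallback is the one from the suggested diff, and the strict ">" comparison against the threshold is a guess at how the test decides pass or fail. The function name evaluate_leak is hypothetical.

```python
def evaluate_leak(free_before_kb, free_after_kb, params, default_threshold="1.0"):
    """Return (mem_reduced_kb, mem_reduced_percent, exceeded_threshold)."""
    # Guard 1: a zero (or negative) baseline makes the percentage meaningless.
    if free_before_kb <= 0:
        raise ValueError(
            "Free memory before hotplug is %s KB, cannot calculate percentage"
            % free_before_kb
        )

    # Guard 2: fall back to a default so float() never receives None,
    # and report a clear error if the configured value is not numeric.
    raw_threshold = params.get("mem_leak_threshold_percent", default_threshold)
    try:
        threshold_percent = float(raw_threshold)
    except (TypeError, ValueError):
        raise ValueError(
            "mem_leak_threshold_percent is not numeric: %r" % (raw_threshold,)
        )

    mem_reduced = free_before_kb - free_after_kb
    mem_reduced_percent = (mem_reduced / free_before_kb) * 100
    return mem_reduced, mem_reduced_percent, mem_reduced_percent > threshold_percent


# Hypothetical readings in KB: a 5000 KB drop from 1,000,000 KB is about 0.5%.
strict = evaluate_leak(1000000, 995000, {"mem_leak_threshold_percent": "0.2"})
lenient = evaluate_leak(1000000, 995000, {})  # falls back to the 1.0% default
```

Whether a missing threshold should silently fall back to a default or fail the test outright is a design choice; a default keeps old configurations running, while a hard failure makes a misconfigured job visible.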

yanglei-rh · 2025-12-18T09:29:26Z

qemu/tests/memory_leak_after_nichotplug.py

        pci_model = params.get("pci_model")
        netdst = params.get("netdst", "virbr0")
        nettype = params.get("nettype", "bridge")
+        hotplug_sleep = int(params.get("hotplug_sleep", 3))


Please switch to get_numeric.
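A rough illustration of what this request might look like, assuming get_numeric is a method on the params object of the form get_numeric(key, default, target_type). That signature is an assumption and should be checked against the framework; the ParamsSketch class below is a hypothetical stand-in that only mimics it so the snippet runs on its own.

```python
class ParamsSketch(dict):
    """Hypothetical stand-in for the test's params object."""

    def get_numeric(self, key, default=0, target_type=int):
        # Look the key up, fall back to the default, then convert the type.
        return target_type(self.get(key, default))


# Config values arrive as strings; hotunplug_sleep is left out on purpose.
params = ParamsSketch(hotplug_sleep="3", mem_leak_threshold_percent="0.2")

hotplug_sleep = params.get_numeric("hotplug_sleep", 3)
hotunplug_sleep = params.get_numeric("hotunplug_sleep", 3)
threshold_percent = params.get_numeric("mem_leak_threshold_percent", 1.0, float)
```

Compared with int(params.get(...)) and float(params.get(...)), this keeps the lookup, the default, and the type conversion in one call.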

yanglei-rh · 2025-12-18T09:32:29Z

qemu/tests/memory_leak_after_nichotplug.py

                queues=params.get("queues"),
            )
-            time.sleep(3)
+            time.sleep(hotplug_sleep)


If you would rather not hard-code the value, you can combine them into one parameter. While hotplug_sleep and hotunplug_sleep are clear, they are somewhat redundant.

Yes, I agree with you. We can use the same value for the hotplug and hot-unplug operations; the difference doesn't matter, and it makes the code cleaner.
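A minimal sketch of the single-parameter idea, assuming the combined parameter keeps the name hotplug_sleep and that the test waits once after plugging and once after unplugging. The hotplug_cycle helper and its plug/unplug callbacks are hypothetical placeholders for the real hotplug and unplug calls, and a plain dict stands in for params.

```python
import time


def hotplug_cycle(params, plug, unplug, sleep=time.sleep):
    """Run one plug/unplug cycle, waiting the same time after each step."""
    wait = int(params.get("hotplug_sleep", 3))
    plug()
    sleep(wait)
    unplug()
    sleep(wait)


# Record the sequence with a fake sleep so the example finishes instantly.
events = []
hotplug_cycle(
    {"hotplug_sleep": "2"},
    plug=lambda: events.append("plug"),
    unplug=lambda: events.append("unplug"),
    sleep=lambda seconds: events.append(seconds),
)
```

Passing the sleep function in as an argument is only there to make the sketch easy to exercise without waiting; the real test would presumably call time.sleep directly.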

yanglei-rh · 2025-12-18T09:39:29Z

Hi @heywji, I don't object to using percentages to calculate the results. But please double-check with the developers to ensure the count is correct, so we don't overlook any bugs. And remember to paste the results of your discussion with the developers into the automation bug.

heywji · 2025-12-18T12:00:47Z

@yanglei-rh OK, but does the 0.2% make sense for your Linux side? Could you please help confirm that as well?

yanglei-rh · 2025-12-18T12:51:22Z

> @yanglei-rh OK, but does the 0.2% make sense for your Linux side? Could you please help confirm that as well?

In fact, it is not related to whether the guest is Windows or Linux; you should confirm it with the memory developers.
