-
Notifications
You must be signed in to change notification settings - Fork 182
netkvm: Make NIC hotplug memory leak test configurable #4426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Refactor the test to use a percentage-based memory leak threshold instead of a fixed value, making it more reliable for VMs with different memory sizes. The wait times for hotplug and unplug events are also now configurable parameters. Signed-off-by: Wenkang Ji <[email protected]>
WalkthroughThe changes configure memory leak detection thresholds and timing parameters for NIC hotplug testing. The configuration file introduces hotplug and hotunplug sleep durations set to 3 seconds each, along with per-OS memory leak thresholds via a variants block (1.0% for Windows, 0.2% for Linux). The test script replaces fixed 3-second sleep durations with configurable parameters and implements percentage-based memory leak evaluation, computing the ratio of memory reduced to free memory before hotplug and comparing against configurable thresholds instead of hard-coded logic. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
@yanglei-rh Hi Lei, could you please help review this patch? Especially the "mem_leak_threshold_percent" for your Linux side, if it makes sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
qemu/tests/memory_leak_after_nichotplug.py (1)
66-66: Inconsistent with PR objectives: Linux path still uses hardcoded sleep.The PR description states "Makes wait times for hotplug and unplug events configurable parameters," but the Linux code path (line 66) still uses a hardcoded
time.sleep(3). This is inconsistent with the Windows implementation (lines 43-44, 54, 56) and the configuration parameters defined in the cfg file.Apply this diff to use the configurable parameter:
+ hotplug_sleep = int(params.get("hotplug_sleep", 3)) + hotunplug_sleep = int(params.get("hotunplug_sleep", 3)) session.cmd_output_safe("swapoff -a") mac = vm.get_mac_address() guest_nic = utils_net.get_linux_ifname(session, mac) for i in range(1, 300): session.cmd_output_safe( "ip link add link %s name %s.%s type vlan id %s" % (guest_nic, guest_nic, i, i) ) - time.sleep(3) + time.sleep(hotplug_sleep) for i in range(1, 300): session.cmd_output_safe("ip link delete %s.%s" % (guest_nic, i)) + time.sleep(hotunplug_sleep)
🧹 Nitpick comments (1)
qemu/tests/cfg/memory_leak_after_nichotplug.cfg (1)
6-8: Clarify comment to cover both parameters.The comment mentions "device initialization" but hotunplug_sleep also needs clarification. Consider specifying that both parameters are in seconds and apply to different operations.
Apply this diff to improve clarity:
- # Wait time for device initialization (seconds) + # Wait time in seconds: hotplug_sleep for device initialization, hotunplug_sleep for device removal hotplug_sleep = 3 hotunplug_sleep = 3
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
qemu/tests/cfg/memory_leak_after_nichotplug.cfg(1 hunks)qemu/tests/memory_leak_after_nichotplug.py(3 hunks)
🔇 Additional comments (5)
qemu/tests/cfg/memory_leak_after_nichotplug.cfg (1)
9-14: LGTM!The OS-specific thresholds are well-chosen, with Windows having a higher tolerance (1.0%) than Linux (0.2%), which aligns with typical OS memory management differences.
qemu/tests/memory_leak_after_nichotplug.py (4)
43-44: LGTM!The configurable sleep parameters are correctly parsed with sensible defaults for the Windows hotplug flow.
54-56: LGTM!The configurable sleep times are correctly applied in the Windows hotplug/unplug loop.
77-82: LGTM!The percentage-based comparison logic and updated error message are clear and informative, providing both absolute and relative metrics for debugging.
83-86: LGTM!The context message correctly displays both absolute and percentage values for the memory reduction.
| mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100 | ||
| threshold_percent = float(params.get("mem_leak_threshold_percent")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Division by zero and missing default value.
Two critical issues:
- Division by zero: Line 74 will raise
ZeroDivisionErroriffree_mem_before_nichotplugis 0. - Missing default: Line 75 calls
float()onparams.get("mem_leak_threshold_percent")without a default value. If the parameter is missing,params.get()returnsNoneandfloat(None)raisesTypeError.
Apply this diff to add safeguards:
mem_reduced = free_mem_before_nichotplug - free_mem_after_nichotplug
- mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100
- threshold_percent = float(params.get("mem_leak_threshold_percent"))
+
+ if free_mem_before_nichotplug == 0:
+ test.error("Free memory before hotplug is 0, cannot calculate percentage")
+
+ mem_reduced_percent = (mem_reduced / free_mem_before_nichotplug) * 100
+ threshold_percent = float(params.get("mem_leak_threshold_percent", "1.0"))🤖 Prompt for AI Agents
In qemu/tests/memory_leak_after_nichotplug.py around lines 74-75, protect
against division-by-zero and missing parameter: before computing
mem_reduced_percent, check if free_mem_before_nichotplug is zero and set
mem_reduced_percent to 0 (or skip/fail the check) to avoid ZeroDivisionError;
when reading the threshold, use params.get("mem_leak_threshold_percent", "0")
(or another sensible default string) and then cast with float(...) so float()
never receives None—optionally validate the parsed float and raise a clear error
if it's not numeric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, will include it together when Lei also reviews the Linux value part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@heywji, sounds good! I'll wait for the updated changes after Lei's review. Feel free to tag me if you have any questions when implementing the fixes.
| pci_model = params.get("pci_model") | ||
| netdst = params.get("netdst", "virbr0") | ||
| nettype = params.get("nettype", "bridge") | ||
| hotplug_sleep = int(params.get("hotplug_sleep", 3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please switch to get_numeric.
| queues=params.get("queues"), | ||
| ) | ||
| time.sleep(3) | ||
| time.sleep(hotplug_sleep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't would like to using hard code, you can combine them an one parameters . While hotplug_sleep and hotunplug_sleep are clear, they are somewhat redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree with you. We can set the same value for the hotplug and hot-unplug operation, which doesn't matter and makes the code cleaner.
|
Hi @heywji I don't object to you using percentages to calculate the results. But please double-check with the developers to ensure the count is correct, so we don't overlook any bug. And remember to paste the results of your discussion with the developers into the automation bug. |
|
@yanglei-rh OK, but does the 0.2% make sense for your Linux side? Could you please help confirm that as well? |
In fact, it not related to windows or linux guest, you should confirm it with memory developer. |
Refactor the test to use a percentage-based memory leak threshold instead of a fixed value, making it more reliable for VMs with different memory sizes. The wait times for hotplug and unplug events are also now configurable parameters.
ID: 1480
Signed-off-by: Wenkang Ji [email protected]
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.