Balloon: Check WMI-Activity after restarting balloon service #4420
base: master
Walkthrough

Adds Windows-specific verification for WMI Event ID 5858 in balloon tests. Two new methods, assert_no_wmi_error_5858(self, session), were added to BallooningTest and BallooningTestWin to copy and run a configurable PowerShell script on Windows guests and interpret its exit codes. balloon_service.py now calls this check after "run" balloon operations on Windows. Configuration keys (wmi_check_script, wmi_script_guest_path, wmi_5858_check_cmd) were added to cfg/balloon_service.cfg. A new PowerShell script deps/balloon/check_wmi_5858.ps1 was introduced.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
3d6fadb to
bd5f346
Compare
Actionable comments posted: 0
🧹 Nitpick comments (1)
qemu/tests/balloon_check.py (1)
504-557: assert_no_wmi_error_5858 behavior and exit-code mapping look solid

The helper cleanly maps the PowerShell command’s exit codes (0/1/2/other) into “pass/fail/skip/warn” semantics, and degrades gracefully when `wmi_5858_check_cmd` is missing or returns an unexpected status. This matches the cfg command and avoids breaking non-Windows or misconfigured runs.

If, in the future, you want the “run after restart” scenario to also assert that `blnsvr` is actually running, you could treat `status == 2` as a failure (or add a separate strict mode) in that specific call site, but that’s optional and not required for this PR’s goal.
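The mapping described in this comment can be sketched roughly as follows. This is an illustration only: `Outcome`, `classify_wmi_status`, and the `strict` flag are hypothetical names, not code from the PR, and the "other means warn" branch follows the semantics quoted in this comment rather than any later revision.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"
    WARN = "warn"


def classify_wmi_status(status, strict=False):
    """Map the probe's exit status to a test outcome.

    0 -> no 5858 event found, 1 -> event found, 2 -> blnsvr not running.
    With strict=True (hypothetical), a missing blnsvr process is a failure.
    Any other status is reported as a warning.
    """
    if status == 0:
        return Outcome.PASS
    if status == 1:
        return Outcome.FAIL
    if status == 2:
        return Outcome.FAIL if strict else Outcome.SKIP
    return Outcome.WARN
```

A call site that runs right after a service restart could pass `strict=True`, while other callers keep the lenient default.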
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
qemu/tests/balloon_check.py (1 hunks)
qemu/tests/balloon_service.py (1 hunks)
qemu/tests/cfg/balloon_service.cfg (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
qemu/tests/balloon_service.py (1)
qemu/tests/balloon_check.py (2)
operate_balloon_service (560-583), assert_no_wmi_error_5858 (504-557)
🔇 Additional comments (2)
qemu/tests/cfg/balloon_service.cfg (1)
72-72: WMI 5858 PowerShell probe matches helper expectations

The PowerShell command’s exit codes (0: NONE, 1: HIT, 2: NO_PID) and log selection/filtering align cleanly with `assert_no_wmi_error_5858`, and scoping it under the Windows `disable_enable` (stop/run) variant fits the “restart” use case. Looks good as-is.

qemu/tests/balloon_service.py (1)

92-104: Windows-only WMI 5858 check is correctly scoped and placed

Gating the WMI 5858 check behind `windows_run` ensures it’s only invoked for Windows runs of the balloon service, and placing it after `balloon_memory` keeps the existing flow intact while adding the new validation. No issues spotted.
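A rough sketch of the gating described here, assuming the service operations come from a space-separated string and that ballooning happens after every operation. The function names and the callback style are hypothetical; the PR's actual loop in balloon_service.py is not shown in this thread.

```python
def should_check_wmi(os_type, operation):
    """True only for a Windows guest whose balloon service was just run."""
    return os_type == "windows" and operation == "run"


def run_service_operations(os_type, operations, operate, balloon, check_wmi):
    """Apply each service operation, balloon memory, then gate the WMI check."""
    for operation in operations.split():
        operate(operation)
        balloon()
        # The WMI probe runs last so the existing flow stays unchanged.
        if should_check_wmi(os_type, operation):
            check_wmi()
```

Computing the condition in one helper keeps the Windows-only logic in a single place instead of spreading `os_type` checks through the loop.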
bd5f346 to
3c9cc08
Compare
Hi @xiagao, I've implemented the suggested changes.
Actionable comments posted: 0
🧹 Nitpick comments (1)
qemu/tests/balloon_check.py (1)
510-511: Optional: Remove redundant OS check.

This method is in the `BallooningTestWin` class, which is already Windows-specific. The OS type check here is defensive but unnecessary.

Consider removing these lines:

    - if self.params.get("os_type") != "windows":
    -     return
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
qemu/tests/balloon_check.py (1 hunks)
qemu/tests/balloon_service.py (1 hunks)
qemu/tests/cfg/balloon_service.cfg (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- qemu/tests/cfg/balloon_service.cfg
🧰 Additional context used
🧬 Code graph analysis (1)
qemu/tests/balloon_service.py (1)
qemu/tests/balloon_check.py (3)
operate_balloon_service (545-568), balloon_memory (195-233), assert_no_wmi_error_5858 (504-542)
🔇 Additional comments (5)
qemu/tests/balloon_service.py (1)
92-104: LGTM! Well-structured Windows-specific WMI event validation.

The implementation correctly:
- Computes the `windows_run` condition once for reuse
- Adds the WMI 5858 check only after "run" operations on Windows guests
- Provides clear error context for logging
- Integrates cleanly with the new `assert_no_wmi_error_5858` method

qemu/tests/balloon_check.py (4)
534-537: Verify status 2 handling after "run" operation.

Status 2 indicates the balloon service is not running, which triggers an info log and skips the check. However, this method is called immediately after a "run" operation in `balloon_service.py`. If the service is not running after being started, this might indicate a failure to start or a crash.

Current behavior treats this as a benign skip scenario. Consider whether this should be:
- A test failure (service should be running after "run" operation)
- A warning rather than info (unexpected but not critical)
- The current behavior (acceptable timing/race condition)
Review the test flow to determine if there are legitimate scenarios where the service might not be running immediately after the "run" operation (e.g., slow startup, timing window), or if this always indicates a problem that should fail the test.
517-519: Good defensive check for missing configuration.

Explicitly failing when `wmi_5858_check_cmd` is missing prevents silent test passes and ensures the check is actually performed. This addresses previous reviewer feedback about ensuring the test doesn't pass without running the validation.

526-542: Well-structured status code handling.

The if/elif/else structure cleanly handles all expected status codes with appropriate actions:
- Status 0: Success (no events found)
- Status 1: Test failure (events detected)
- Status 2: Skip scenario (service not running)
- Other: Test failure (unexpected status)
This implementation addresses previous reviewer feedback to use if/elif/else and handle all cases explicitly.
520-520: Add timeout parameter to cmd_status_output call to prevent indefinite blocking.

The PowerShell command execution at line 520 lacks an explicit timeout. The `cmd_status_output()` method supports a timeout parameter (as shown in similar calls throughout the codebase with `timeout=360`, `timeout=240`, etc.), but this call omits it. If the PowerShell command hangs, test execution could block indefinitely. Consider specifying a reasonable timeout like other operations in the file:

    status, output = session.cmd_status_output(ps_cmd, timeout=60)
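To illustrate why an explicit timeout matters, here is a local stand-in built on the standard library's `subprocess.run`. It is not the framework's session API: `run_with_timeout` is a hypothetical helper, and returning `None` as the status on timeout is a choice made for this sketch only.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run a command and return (status, output), giving up after `timeout` seconds."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Without a timeout, a hung command would block the caller forever.
        return None, "timed out after %s s" % timeout
    return proc.returncode, proc.stdout.strip()


# A fast command completes normally; a slow one is cut off.
fast = run_with_timeout([sys.executable, "-c", "print('NONE')"], timeout=60)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout=1)
```

The same idea applies to the guest command: bounding the wait turns a hang into a reportable failure.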
qemu/tests/cfg/balloon_service.cfg
Outdated
    only Windows
    repeat_times = 1
    blnsrv_operation = "stop run"
    wmi_5858_check_cmd = "powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -Command \"$ErrorActionPreference='SilentlyContinue'; $log='Microsoft-Windows-WMI-Activity/Operational'; $p=(Get-Process -Name blnsvr -ErrorAction SilentlyContinue | Select-Object -First 1); if (-not $p) { Write-Output 'NO_PID'; exit 2 }; $blnPid=$p.Id; $start=$p.StartTime.ToUniversalTime(); $events = Get-WinEvent -FilterHashtable @{LogName=$log; Id=5858; StartTime=$start} -MaxEvents 50 -ErrorAction SilentlyContinue | Where-Object { $_.Level -eq 2 -and $_.Message -match ('ClientProcessId = ' + $blnPid) }; $e = $events | Select-Object -First 1; if ($null -ne $e) { Write-Output 'HIT'; ($e | Format-List -Property TimeCreated,Id,Message | Out-String).Trim() | Write-Output; exit 1 } else { Write-Output 'NONE'; exit 0 };\""
Hi @elizashurov , everything else looks good to me.
This cmd is a bit long. How about creating a new PowerShell script, placing it under the io-github-autotest-qemu/deps directory, and copying it from the host with scp when running the test cases?
Refer to: https://github.com/autotest/tp-qemu/blob/master/qemu/tests/iperf_test.py#L138
Hi @xiagao , I've created a new PowerShell script (check_wmi_5858.ps1) and placed it under deps/balloon/. The script is now copied from host to guest when running the test case.
I tested the changes and everything works as expected. Attaching a screenshot from the debug.log for reference.

Thanks @elizashurov, the code looks good to me.

(1/2) Host_RHEL.m10.u1.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.Win2025.x86_64.io-github-autotest-qemu.balloon_service.small_polling_interval.q35: STARTED
3c9cc08 to
1bf60ce
Compare
Actionable comments posted: 0
🧹 Nitpick comments (5)
qemu/tests/cfg/balloon_service.cfg (1)
72-75: Consider adding explanatory comment for WMI 5858 check.

The configuration keys are clear and well-placed. However, adding a brief comment explaining what WMI Event ID 5858 represents (e.g., "WMI-Activity error that can occur when blnsvr.exe starts") would improve maintainability for future developers unfamiliar with this specific Windows event.

Suggested documentation addition

    + # WMI Event ID 5858 check - detects WMI-Activity errors from blnsvr.exe
      # WMI 5858 check script (copied from deps/balloon/)
      wmi_check_script = "check_wmi_5858.ps1"
      wmi_script_guest_path = "C:\check_wmi_5858.ps1"
      wmi_5858_check_cmd = "powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -File %s"

deps/balloon/check_wmi_5858.ps1 (2)
26-29: Pattern matching could be more robust.

The current pattern `'ClientProcessId = ' + $blnPid` uses simple string matching which could theoretically match false positives if the PID value appears elsewhere in the event message. While this is unlikely to cause issues in practice, a more specific regex pattern would be more robust.

Suggested improvement

      Where-Object {
          $_.Level -eq 2 -and
    -     $_.Message -match ('ClientProcessId = ' + $blnPid)
    +     $_.Message -match ('ClientProcessId\s*=\s*' + $blnPid + '\b')
      }

The `\b` word boundary ensures the PID matches as a complete number, and `\s*` handles potential whitespace variations.
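The difference between the two patterns can be checked with a small Python analogue. The PowerShell `-match` operator uses .NET regular expressions, but `\s*` and `\b` behave the same way for this case. The sample messages in the usage note are made up for illustration and are not real Event ID 5858 text; `matches_pid` is a hypothetical helper.

```python
import re


def matches_pid(message, pid, strict=True):
    """Return True if the message names the given client PID.

    strict=True uses the suggested pattern (flexible whitespace, word boundary);
    strict=False uses the original concatenated pattern.
    """
    if strict:
        pattern = r"ClientProcessId\s*=\s*" + str(pid) + r"\b"
    else:
        pattern = "ClientProcessId = " + str(pid)
    return re.search(pattern, message) is not None
```

With PID 123, the loose pattern also matches a message about PID 1234, because `123` is a prefix of `1234`; the strict pattern rejects it while still accepting `ClientProcessId=123` written without spaces.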
7-7: Redundant error action preference settings.

The script sets `$ErrorActionPreference = 'SilentlyContinue'` globally (Line 7), but then explicitly uses `-ErrorAction SilentlyContinue` on specific cmdlets (Lines 11, 25). While this works correctly, the explicit parameters are redundant given the global setting.

You can either:
- Keep the global setting and remove the explicit `-ErrorAction` parameters, or
- Remove the global setting and keep only the explicit parameters where needed

Option 2 is generally preferred as it's more explicit about which operations should silently continue.

Also applies to: 11-11, 25-25
qemu/tests/balloon_check.py (2)
519-525: Consider adding error handling for file operations.

The file copy operation (`vm.copy_files_to`) could fail if the source script doesn't exist or if there are permission/connection issues. While the test framework may handle exceptions at a higher level, adding explicit error handling would provide clearer error messages for debugging.

Suggested error handling

      # Copy WMI 5858 check script from host to guest
      balloon_deps_dir = data_dir.get_deps_dir("balloon")
      wmi_script_name = self.params.get("wmi_check_script")
      wmi_script_host = os.path.join(balloon_deps_dir, wmi_script_name)
    +
    + if not os.path.exists(wmi_script_host):
    +     self.test.error(
    +         "WMI check script not found at: %s" % wmi_script_host
    +     )
    +
      wmi_script_guest_path = self.params.get("wmi_script_guest_path")
    - self.vm.copy_files_to(wmi_script_host, wmi_script_guest_path)
    + try:
    +     self.vm.copy_files_to(wmi_script_host, wmi_script_guest_path)
    + except Exception as e:
    +     self.test.error(
    +         "Failed to copy WMI check script to guest: %s" % e
    +     )
      self.test.log.info("Copied WMI check script to guest: %s", wmi_script_guest_path)
505-551: Consider validating session state.

The method accepts a `session` parameter but doesn't verify it's valid before executing commands. While the calling code likely ensures a valid session, adding a basic check would make the method more defensive.

Suggested session validation

      def assert_no_wmi_error_5858(self, session):
          """
          Ensure there are no WMI-Activity Event ID 5858 entries emitted by the balloon service since it started.
          This checks the 'Microsoft-Windows-WMI-Activity/Operational' log.
          """
          if self.params.get("os_type") != "windows":
              return
    +
    +     if not session or not session.is_responsive():
    +         self.test.error("Session is not valid or responsive for WMI check")
    +
          log_name = "Microsoft-Windows-WMI-Activity/Operational"
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deps/balloon/check_wmi_5858.ps1
qemu/tests/balloon_check.py
qemu/tests/balloon_service.py
qemu/tests/cfg/balloon_service.cfg
🚧 Files skipped from review as they are similar to previous changes (1)
- qemu/tests/balloon_service.py
🔇 Additional comments (2)
qemu/tests/balloon_check.py (2)
1-1: LGTM - Appropriate imports added.

The added imports (`os` for path operations and `data_dir` for accessing the deps directory) are correctly placed and necessary for the new WMI check functionality.

Also applies to: 8-8
535-551: Exit code handling is well-structured and correct.

The if/elif/else structure properly handles all expected exit codes from the PowerShell script:
- Status 0: No WMI 5858 events (success)
- Status 1: WMI 5858 detected (test failure)
- Status 2: Service not running (informational skip)
- Other: Unexpected error (test failure)
This matches the exit codes documented in the PowerShell script and follows the structure suggested in past reviews.
Note: The method name `assert_no_wmi_error_5858` suggests it always asserts/fails, but status 2 logs and continues without failing. This is reasonable behavior (can't check events if service isn't running), but the naming could be slightly misleading. However, given the context and that tests have already passed, this is acceptable.
Actionable comments posted: 0
🧹 Nitpick comments (2)
qemu/tests/balloon_check.py (2)
511-512: Redundant OS check in Windows-specific class.

This guard is unnecessary since `assert_no_wmi_error_5858` is defined in `BallooningTestWin`, which is only instantiated for Windows guests. The check provides defensive safety but adds dead code in normal usage.
521-530: Consider validating required configuration parameters.

The configuration keys (`wmi_check_script`, `wmi_script_guest_path`, `wmi_5858_check_cmd`) are defined in `balloon_service.cfg`, but if this method is invoked without proper configuration, `params.get()` returns `None`, causing `os.path.join` or string formatting to fail with unclear errors.

Consider using `params["key"]` (bracket notation) for required parameters to raise a clear `KeyError`, or add explicit validation with a descriptive error message.

🔎 Proposed fix using bracket notation for required params

    - wmi_script_name = self.params.get("wmi_check_script")
    + wmi_script_name = self.params["wmi_check_script"]
      wmi_script_host = os.path.join(balloon_deps_dir, wmi_script_name)
    - wmi_script_guest_path = self.params.get("wmi_script_guest_path")
    + wmi_script_guest_path = self.params["wmi_script_guest_path"]
      self.vm.copy_files_to(wmi_script_host, wmi_script_guest_path)
      self.test.log.info(
          "Copied WMI check script to guest: %s", wmi_script_guest_path
      )
      # Execute the script
    - ps_cmd = self.params.get("wmi_5858_check_cmd") % wmi_script_guest_path
    + ps_cmd = self.params["wmi_5858_check_cmd"] % wmi_script_guest_path
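The difference in failure mode can be reproduced with a plain dict standing in for the test params. That substitution is an assumption: the framework's params class may raise its own exception type for missing keys. The helper names below are hypothetical.

```python
import os


def host_script_with_get(params, deps_dir):
    # .get() returns None for a missing key, so the error surfaces later
    # inside os.path.join and does not mention the missing parameter.
    return os.path.join(deps_dir, params.get("wmi_check_script"))


def host_script_with_brackets(params, deps_dir):
    # Bracket access fails immediately and names the missing key.
    return os.path.join(deps_dir, params["wmi_check_script"])


def error_name(func, params, deps_dir):
    """Return the exception class name raised by func, or None if it succeeds."""
    try:
        func(params, deps_dir)
    except Exception as exc:
        return type(exc).__name__
    return None
```

With an empty dict, the `.get()` variant raises `TypeError` from `os.path.join`, while the bracket variant raises `KeyError` naming `wmi_check_script`.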
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deps/balloon/check_wmi_5858.ps1
qemu/tests/balloon_check.py
qemu/tests/balloon_service.py
qemu/tests/cfg/balloon_service.cfg
🚧 Files skipped from review as they are similar to previous changes (2)
- deps/balloon/check_wmi_5858.ps1
- qemu/tests/cfg/balloon_service.cfg
🧰 Additional context used
🧬 Code graph analysis (1)
qemu/tests/balloon_service.py (1)
qemu/tests/balloon_check.py (3)
operate_balloon_service (556-579), balloon_memory (196-234), assert_no_wmi_error_5858 (505-553)
🔇 Additional comments (3)
qemu/tests/balloon_service.py (1)
92-104: LGTM! Clean integration of WMI 5858 verification.

The conditional check is properly scoped to Windows guests with the "run" operation, and the placement after `balloon_memory` ensures the service has fully restarted before verifying the event log.

qemu/tests/balloon_check.py (2)

1-8: LGTM! New imports for `os` and `data_dir` are necessary for the file operations in the new WMI check method.

537-553: LGTM! Comprehensive exit code handling.

All exit codes are properly handled: success (0), failure (1), service not running (2), and unexpected statuses with `test.fail()`. This addresses the past review feedback requesting proper handling of all return values.
Check for WMI-Activity Event ID 5858 since blnsvr.exe start. Filter by ClientProcessId. Use PS exit codes: 0=NONE, 1=HIT, 2=NO_PID. Command configured in cfg. id: 1315 Signed-off-by: Elizabeth Ashurov <[email protected]>
Actionable comments posted: 0
🧹 Nitpick comments (1)
qemu/tests/balloon_check.py (1)
519-527: Consider adding parameter validation for required configuration.

The code retrieves configuration parameters (`wmi_check_script`, `wmi_script_guest_path`) without checking if they exist. If these are missing from the test configuration, the code will fail with potentially unclear error messages (e.g., `TypeError` or `AttributeError`).

🔎 Optional: Add parameter validation

While the test framework may ensure these parameters are present, explicit validation could improve error clarity:

      # Copy WMI 5858 check script from host to guest
      balloon_deps_dir = data_dir.get_deps_dir("balloon")
      wmi_script_name = self.params.get("wmi_check_script")
    + if not wmi_script_name:
    +     self.test.error("Missing required parameter: wmi_check_script")
      wmi_script_host = os.path.join(balloon_deps_dir, wmi_script_name)
    + if not os.path.exists(wmi_script_host):
    +     self.test.error("WMI check script not found: %s" % wmi_script_host)
      wmi_script_guest_path = self.params.get("wmi_script_guest_path")
    + if not wmi_script_guest_path:
    +     self.test.error("Missing required parameter: wmi_script_guest_path")

However, if the test configuration is well-validated elsewhere, this may be unnecessary defensive coding.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deps/balloon/check_wmi_5858.ps1
qemu/tests/balloon_check.py
qemu/tests/balloon_service.py
qemu/tests/cfg/balloon_service.cfg
🚧 Files skipped from review as they are similar to previous changes (1)
- qemu/tests/cfg/balloon_service.cfg
🧰 Additional context used
🧬 Code graph analysis (1)
qemu/tests/balloon_service.py (1)
qemu/tests/balloon_check.py (1)
assert_no_wmi_error_5858(505-553)
🔇 Additional comments (11)
qemu/tests/balloon_service.py (2)
92-92: LGTM: Clean flag for Windows-specific validation.

The `windows_run` flag correctly gates the WMI check to only Windows guests performing the "run" operation, which aligns with the PR objective of checking WMI events after service restart.

99-104: LGTM: Proper integration of WMI event validation.

The WMI 5858 check is correctly positioned after the balloon memory operation and appropriately gated by the `windows_run` flag. The context logging provides clear traceability for test results.

deps/balloon/check_wmi_5858.ps1 (5)
windows_runflag. The context logging provides clear traceability for test results.deps/balloon/check_wmi_5858.ps1 (5)
1-8: LGTM: Clear documentation and appropriate error handling.

The exit code documentation is clear and the `SilentlyContinue` preference is appropriate for a test validation script where certain errors (like missing process or no events) are expected outcomes rather than failures.

10-15: LGTM: Robust process detection with proper fallback.

The script correctly handles the case when the balloon service process is not running, using `Select-Object -First 1` to avoid ambiguity if multiple instances exist, and exits with the documented NO_PID status.

17-18: LGTM: Proper time normalization for event correlation.

Converting `StartTime` to UTC ensures consistent comparison with WMI event timestamps, which are typically stored in UTC. This prevents timezone-related mismatches.
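The reasoning behind the UTC normalization can be illustrated with a Python analogue of the script's `ToUniversalTime()` call. The guest offset and the timestamps are made up for illustration, and both helpers are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical guest running at UTC+2: the service starts at 12:00 local time,
# which is 10:00 UTC, and an event is logged at 10:30 UTC.
guest_tz = timezone(timedelta(hours=2))
start_local = datetime(2025, 1, 1, 12, 0, tzinfo=guest_tz)
event_utc = datetime(2025, 1, 1, 10, 30, tzinfo=timezone.utc)


def is_after_start_naive(event, start):
    """Compare wall-clock fields only, ignoring offsets (the mistake to avoid)."""
    return event.replace(tzinfo=None) >= start.replace(tzinfo=None)


def is_after_start_utc(event, start):
    """Normalize both timestamps to UTC before comparing."""
    return event.astimezone(timezone.utc) >= start.astimezone(timezone.utc)
```

The naive comparison treats 10:30 as earlier than 12:00 and would drop the event, while the normalized comparison correctly places it 30 minutes after the service start.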
20-29: Verify the regex pattern matches actual event message format.

The regex pattern `'ClientProcessId = ' + $blnPid` assumes a specific format in the WMI event message. While this should work for Event ID 5858, consider verifying against actual event logs to ensure the pattern is robust across different Windows versions or message variations.

If you have access to sample Event ID 5858 messages from different Windows environments, you could verify the ClientProcessId format consistency. Alternatively, consider using a more flexible pattern if the format varies.

31-40: LGTM: Clear result reporting with diagnostic details.

The result handling provides useful diagnostic information (TimeCreated, Id, Message) when an error event is found, while keeping the output simple for the success case. Exit codes correctly implement the documented contract.

qemu/tests/balloon_check.py (4)
qemu/tests/balloon_check.py (4)
1-1: LGTM: Required import for file path operations.

The `os` module import is necessary for `os.path.join()` used in the new WMI check method.

8-8: LGTM: Data directory utility for dependency access.

Adding `data_dir` to the imports enables proper access to the balloon dependency directory for locating the WMI check script.

505-517: LGTM: Well-structured method with clear gating and documentation.

The method properly gates Windows-specific logic with an early return, provides clear documentation, and sets up appropriate context logging for test traceability.

529-553: LGTM: Comprehensive exit code handling addresses previous review feedback.

The if/elif/else structure properly handles all possible exit codes as requested in past reviews:
- Status 0: No events found (success)
- Status 1: Error event detected (test failure)
- Status 2: Process not running (skip with info log)
- Other: Unexpected status (test failure with diagnostics)
This provides complete coverage and clear failure reporting.
Note: The command execution via `session.cmd_status_output()` doesn't specify an explicit timeout. If the PowerShell script were to hang, this could block the test. However, given the script's simple queries and the framework's likely default timeout handling, this should not be a practical concern.
(1/2) Host_RHEL.m10.u2.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.Win2025.x86_64.io-github-autotest-qemu.balloon_service.small_polling_interval.q35: STARTED LGTM.

@vivianQizhu Call for your review, thanks.
Check for WMI-Activity Event ID 5858 since blnsvr.exe start.
id: 1315