[stable] Fixes for bugs found in QA testing #158

Merged
merged 6 commits into from
Apr 30, 2025

brenns10
Member

Same as #157 but applies to the stable/v2.1.x branch. The bug numbers are different for tracking.

The module_has_debuginfo() function is heuristic and only checks the
first symbol to see whether drgn knows of it. Let's increase this to the
first 5 symbols (truly, just a magic number) which eliminates some
observed false positives in the wild.

I'm willing to take this somewhat scattershot approach because we're not
far from switching to drgn's new Module API. This will allow us to drop
this heuristic function and just query the Module API for the status of
the debuginfo file.

Orabug: 37894843

Signed-off-by: Stephen Brennan <[email protected]>
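
For readers unfamiliar with the heuristic, here is a minimal sketch of the "check the first few symbols" idea. It is not the drgn-tools implementation: `has_debuginfo_heuristic`, `symbol_names`, and `is_known` are hypothetical names, and the assumption that a single known symbol is enough to report debuginfo as present is mine, not something stated in the commit.

```python
from itertools import islice
from typing import Callable, Iterable


def has_debuginfo_heuristic(
    symbol_names: Iterable[str],
    is_known: Callable[[str], bool],
    max_symbols: int = 5,  # magic number, as in the commit message
) -> bool:
    """Guess whether a module has debuginfo loaded.

    Only the first ``max_symbols`` names are examined. ``is_known`` is a
    hypothetical stand-in for "the debugger can resolve this symbol".
    Assumption: one resolvable symbol counts as "debuginfo present".
    """
    return any(is_known(name) for name in islice(symbol_names, max_symbols))
```

With `max_symbols=1` this degenerates to the old single-symbol check, which is why a module whose first symbol happens to be unresolvable would be misreported.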
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Apr 30, 2025
@brenns10 brenns10 requested a review from biger410 April 30, 2025 16:37
@brenns10
Member Author

All CI tests are passing.

The heuristic approach for detecting whether DWARF debuginfo is loaded
for kernel modules proved to be even less reliable than previously
expected. For instance, on some kernel versions, over half of kernel
modules are wrongly detected as "missing" debuginfo.

To avoid this causing problems in corelens, let's add the less elegant
but obvious solution: keep track of the modules for which we've loaded
debuginfo, and consult this set first. Fall back to heuristic detection
only when we have no record of loading debuginfo for a module. That case
is unlikely, given that the common use cases of drgn-tools (corelens and
the CLI) both rely on the APIs here to load debuginfo.

Orabug: 37894843

Signed-off-by: Stephen Brennan <[email protected]>
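
The "consult the set first, then fall back" logic can be sketched roughly as follows. This is an illustration only: `DebuginfoTracker`, `record_loaded`, and the injected `heuristic` callable are hypothetical names, not the drgn-tools API.

```python
from typing import Callable, Set


class DebuginfoTracker:
    """Remember which modules had debuginfo loaded through our own API."""

    def __init__(self, heuristic: Callable[[str], bool]) -> None:
        self._loaded: Set[str] = set()
        # Hypothetical fallback, used only for modules we have no record of.
        self._heuristic = heuristic

    def record_loaded(self, module_name: str) -> None:
        """Call this wherever debuginfo is successfully loaded."""
        self._loaded.add(module_name)

    def has_debuginfo(self, module_name: str) -> bool:
        # Authoritative answer first: we loaded it ourselves.
        if module_name in self._loaded:
            return True
        # Otherwise fall back to the unreliable heuristic.
        return self._heuristic(module_name)
```

The design trade-off is that the set can only ever turn a wrong "missing" into a correct "present"; it never overrides a positive heuristic result.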

While we do have logic to catch ctrl-c and broken pipe errors, the logic
doesn't apply to /usr/bin/corelens, due to the way the script is created
by pip/setuptools. Fix this, and while we're at it, de-mystify a comment
I wrote regarding the corelens startup sequence.

Orabug: 37879205

Signed-off-by: Stephen Brennan <[email protected]>
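
Some background that may help here: a script generated from a setuptools `console_scripts` entry point imports the named function and calls it directly, so error handling that lives only under `if __name__ == "__main__":` is never reached. A minimal sketch of moving the handling into a wrapper that the entry point can name, assuming a hypothetical `real_main` callable and conventional shell exit codes (neither is taken from corelens):

```python
from typing import Callable, Optional, Sequence


def run_guarded(
    real_main: Callable[[Optional[Sequence[str]]], None],
    argv: Optional[Sequence[str]] = None,
) -> int:
    """Run ``real_main`` and turn ctrl-c / broken pipe into exit codes.

    Because the handling is inside a function, it applies both when the
    module is run directly and when a generated script calls the function.
    """
    try:
        real_main(argv)
    except KeyboardInterrupt:
        return 130  # 128 + SIGINT; a convention, assumed here
    except BrokenPipeError:
        return 141  # 128 + SIGPIPE; a convention, assumed here
    return 0
```

The entry point would then reference a small function that returns `run_guarded(...)`, rather than the unguarded main function.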

In 60c868eff2bc5 ("arm64/cpufeature: Store elf_hwcaps as a bitmap rather
than unsigned long") the elf_hwcap variable on aarch64 was converted to
a bitmap. Update the cpuinfo module to handle this.

Orabug: 37879692

Signed-off-by: Stephen Brennan <[email protected]>
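
To illustrate what handling both representations involves, here is a sketch that lists the set bit positions from either a single integer (the old `unsigned long`) or a sequence of words (a kernel-style bitmap, where bit `n` lives in word `n // BITS_PER_LONG`). The function name `hwcap_bits` and the plain-Python inputs are assumptions for illustration; the real cpuinfo module reads these values from the kernel image.

```python
from typing import List, Sequence, Union


def hwcap_bits(
    value: Union[int, Sequence[int]], bits_per_long: int = 64
) -> List[int]:
    """Return the positions of all set bits, lowest first."""
    if isinstance(value, int):
        # Old layout: one unsigned long holds every capability bit.
        combined = value
    else:
        # Bitmap layout: word i contributes bits [i * bits_per_long, ...).
        combined = 0
        for index, word in enumerate(value):
            combined |= word << (index * bits_per_long)
    return [
        bit for bit in range(combined.bit_length()) if (combined >> bit) & 1
    ]
```

For example, `hwcap_bits([0b1, 0b10])` reports bits 0 and 65, since the second word starts at bit 64.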

The current code implicitly assumes that the system is using rds_rdma.
If the module runs on a system with rds_tcp loaded, we get a traceback
ending in the following error:

  KeyError: 'rds_ib_devices'

I don't know whether rds_tcp support is desired at this time, but at a
minimum, corelens should not throw a traceback. So, explicitly require
rds_rdma for this module. Now, for rds_tcp, we get the following
standard corelens warning that explains the situation.

  warning: rds skipped because 'rds, rds_rdma' was (were) not (all) loaded in the kernel

Orabug: 37894843

Signed-off-by: Stephen Brennan <[email protected]>
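
The "require the module up front instead of failing with a KeyError" idea can be sketched like this. The helper `skip_reason` and its parameters are hypothetical and do not reflect how corelens declares module requirements; only the warning text is copied from the message above.

```python
from typing import Collection, Optional, Sequence


def skip_reason(
    module_name: str,
    required: Sequence[str],
    loaded: Collection[str],
) -> Optional[str]:
    """Return a warning if any required kernel module is not loaded.

    Returns None when every requirement is met and the module may run.
    """
    if all(name in loaded for name in required):
        return None
    wanted = ", ".join(required)
    return (
        f"warning: {module_name} skipped because '{wanted}' "
        "was (were) not (all) loaded in the kernel"
    )
```

On a system with only `rds` and `rds_tcp` loaded, `skip_reason("rds", ["rds", "rds_rdma"], {"rds", "rds_tcp"})` yields the warning instead of letting the module run into the missing `rds_ib_devices` symbol.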

Ah, the classic Python experience: deprecations for seemingly no reason.

Signed-off-by: Stephen Brennan <[email protected]>
@brenns10 brenns10 merged commit 0637077 into oracle-samples:stable/v2.1.x Apr 30, 2025
4 checks passed
@brenns10 brenns10 deleted the qa_bugs_stable branch April 30, 2025 22:32