Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cpuinfo module #4

Merged
merged 1 commit into from
Oct 27, 2023
Merged

Conversation

Hannibal404
Copy link
Member

No description provided.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Oct 20, 2023
@Hannibal404 Hannibal404 force-pushed the prahar-cpuinfo branch 2 times, most recently from b48779d to d8e1895 Compare October 21, 2023 17:42
@Hannibal404 Hannibal404 reopened this Oct 21, 2023
@Hannibal404 Hannibal404 force-pushed the prahar-cpuinfo branch 4 times, most recently from 4069f79 to fb3eba0 Compare October 25, 2023 10:58
Copy link
Member

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I haven't gotten the chance to do a detailed review of the PR yet, but I did get started on it. Here's some comments for the first few functions. I'll send more feedback tomorrow!

drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
Comment on lines 31 to 34
if "cpu_data" in prog:
cpuinfo_struct = prog["cpu_data"]
elif "boot_cpu_data" in prog:
cpuinfo_struct = prog["boot_cpu_data"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use .read_() here (docs).

The reasoning here is that the cpuinfo struct is a large-ish structure, and we're going to be reading parts of it very frequently. The way drgn handles this (in the live case) is by calling read(/proc/kcore) for each time we access a value. And reading from /proc/kcore has a quite high overhead for each call (but the overhead for each additional byte of data read is much lower). Thus, it makes sense to read chunks of data at a time, rather than small reads. If we call .read_() like shown, then the whole structure gets read at once, and the result is saved as the value of the object.

This means that the value won't be updated to reflect changes over time. But that also is good, because the chances are much better that your read would be atomic, in the live case.

Suggested change
if "cpu_data" in prog:
cpuinfo_struct = prog["cpu_data"]
elif "boot_cpu_data" in prog:
cpuinfo_struct = prog["boot_cpu_data"]
if "cpu_data" in prog:
cpuinfo_struct = prog["cpu_data"].read_()
elif "boot_cpu_data" in prog:
cpuinfo_struct = prog["boot_cpu_data"].read_()

Copy link
Member

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, finished reviewing this. Thanks for doing all of the logic for these different CPU mitigations. I know there's a lot of it here.

drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
drgn_tools/cpuinfo.py Outdated Show resolved Hide resolved
Copy link
Member

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing here, then I think we'll be good, assuming tests pass.

drgn_tools/cpuinfo.py Show resolved Hide resolved
@brenns10
Copy link
Member

Looking good, but there's an internal CI failure:


    def get_retbleed_mitigation(prog: Program, cpuinfo: Object) -> str:
        """
        Extracts Mitigation for Retbleed
        """
        if "retbleed_state" in prog:
            if prog["retbleed_state"] == prog.constant(
                "RETBLEED_MITIGATION_UNRET"
            ):
                if (
>                   cpuinfo.x86_vendor != X86_VENDOR_AMD
                    and cpuinfo.x86_vendor != X86_VENDOR_HYGON
                ):
E               AttributeError: '_drgn.Object' object has no attribute 'x86_vendor'

Looks like get_retbleed_mitigation() expects the cpuinfo object, not the feature/bug flags bitfield. You could add that as another parameter or just look it up from prog.

Signed-off-by: Pradyumn Rahar <[email protected]>
Copy link
Member

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good, thanks!

@brenns10 brenns10 merged commit 630e95e into oracle-samples:main Oct 27, 2023
5 checks passed
imran-kn added a commit that referenced this pull request Feb 8, 2024
I am seeing call stacks like ones shown below are getting
missed by scan_sem_lock.

PID: 1174   TASK: ffff953cfc723100  CPU: 2   COMMAND: sem_waiter
 #0 [ffffb902c05cbda0] __schedule at ffffffff86f7a86b
 #1 [ffffb902c05cbe28] schedule at ffffffff86f7ad9d
 #2 [ffffb902c05cbe38] schedule_timeout at ffffffff86f7d94d
 #3 [ffffb902c05cbe98] __down_killable at ffffffff86f7ccb6
 #4 [ffffb902c05cbee0] down_killable at ffffffff8668d69d

PID: 1181   TASK: ffff99c2bc6a6e40  CPU: 1   COMMAND: sem_waiter
 #0 [ffffa3780060bda0] __schedule at ffffffffbb77a86b
 #1 [ffffa3780060be28] schedule at ffffffffbb77ad9d
 #2 [ffffa3780060be38] schedule_timeout at ffffffffbb77d8df
 #3 [ffffa3780060be98] __down_timeout at ffffffffbb77cdc4
 #4 [ffffa3780060bee0] down_timeout at ffffffffbae8d740

PID: 1175   TASK: ffff96c3fc5a55c0  CPU: 1   COMMAND: sem_waiter
 #0 [ffff9f35c0543da0] __schedule at ffffffff82b7a86b
 #1 [ffff9f35c0543e28] schedule at ffffffff82b7ad9d
 #2 [ffff9f35c0543e38] schedule_timeout at ffffffff82b7d94d
 #3 [ffff9f35c0543e98] __down_interruptible at ffffffff82b7cef6
 #4 [ffff9f35c0543ee0] down_interruptible at ffffffff8228d7dd

down_interruptible and down_timeout are fairly common interfaces,
especially in device drivers.

down_killable is not that common, so can be removed if don't
deem it important enough for an extra scan.

Signed-off-by: Imran Khan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants