Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arch/x86_64: this_task is stored in the CPU private data #15398

Merged
merged 1 commit into from
Jan 2, 2025

Conversation

xianglyc
Copy link
Contributor

@xianglyc xianglyc commented Jan 2, 2025

By default in SMP, obtaining this_task requires disabling interrupts, obtaining the current CPU index, accessing a global variable, and re-enabling interrupts. Storing this_task in percpu makes retrieval faster.

Note: Please adhere to Contributing Guidelines.

Summary

arch/x86_64: this_task is stored in the CPU private data
Update this section with information on why change is necessary,
what it exactly does and how, if new feature shows up, provide
references (dependencies, similar problems and solutions), etc.

Impact

no impact
Update this section, where applicable, on how change affects users,
build process, hardware, documentation, security, compatibility, etc.

Testing

ostest
Update this section with details on how did you verify the change,
what Host was used for build (OS, CPU, compiler, ..), what Target was
used for verification (arch, board:config, ..), etc. Providing build
and runtime logs from before and after change is highly appreciated.

By default in SMP, obtaining this_task requires disabling interrupts, obtaining the current CPU index, accessing a global variable, and re-enabling interrupts. Storing this_task in percpu makes retrieval faster.

Signed-off-by: liwenxiang1 <[email protected]>
@github-actions github-actions bot added Arch: x86_64 Issues related to the x86_64 architecture Size: S The size of the change in this PR is small labels Jan 2, 2025
@nuttxpr
Copy link

nuttxpr commented Jan 2, 2025

[Experimental Bot, please feedback here]

This PR does not fully meet the NuttX requirements, although it's close. Here's why and how to fix it:

Missing Information:

  • Summary: The summary provides a good high-level overview of why the change is made (performance improvement for this_task retrieval). However, it lacks detail on how it achieves this. Specifically, it should mention that the change moves this_task storage from a global variable to per-CPU storage using the CPU private data area.

  • Impact: While stating "no impact" is concise, it's too brief and might be inaccurate. Even performance improvements have an impact (positive in this case). The PR should at least acknowledge the positive performance impact on SMP systems. It should also consider and explicitly state whether there's any impact on single-CPU configurations (e.g., increased memory usage).

  • Testing: "ostest" is insufficient. The PR needs more details.

    • Build Host: Specify the OS, CPU architecture, and compiler version used for the build. (e.g., "Linux x86_64, GCC 12.2.0").
    • Target: Specify the target architecture and configuration. Since this affects x86_64, was it tested on a simulator (qemu-x86_64) or real hardware? Specify the board configuration if applicable. (e.g., "qemu-x86_64, default configuration").
    • Logs: The template specifically requests "Testing logs before change" and "Testing logs after change". Even if ostest passed in both cases, provide snippets of the relevant output, particularly anything demonstrating the performance improvement. If no specific log output demonstrates the improvement, explain how the performance gain was measured. (e.g., "Timed execution of a test application that frequently accesses this_task").

How to Fix the PR:

  1. Expand the Summary: Explain the technical implementation details (moving this_task to per-CPU storage using CPU private data).

  2. Clarify the Impact:

    • State the positive performance impact on SMP systems.
    • Address the impact (if any) on single-CPU configurations. Even if there's no significant impact, explicitly stating that is helpful.
  3. Provide Detailed Testing Information:

    • Specify the Build Host details.
    • Specify the Target details.
    • Include relevant log snippets or explain how the performance improvement was measured.

Example of Improved Sections:

Summary:

arch/x86_64: this_task is stored in the CPU private data. This change improves the performance of retrieving this_task in SMP configurations. Currently, accessing this_task requires disabling interrupts, obtaining the current CPU index, accessing a global variable, and re-enabling interrupts. This change moves the storage of this_task from a global variable to the per-CPU private data area, eliminating the need for the CPU index lookup and reducing overhead.

Impact:

  • Positive impact: Improved performance of this_task retrieval in SMP configurations.
  • Single-CPU impact: Negligible impact on single-CPU systems, potentially a minor increase in memory usage due to per-CPU storage allocation.
  • Other impacts: None.

Testing:

  • Build Host: Linux x86_64, GCC 12.2.0
  • Target: qemu-x86_64, default configuration

A microbenchmark was added to ostest to measure the time taken to retrieve this_task repeatedly.

Testing logs before change:

ostest: Microbenchmark: this_task retrieval (average): 125 ns

Testing logs after change:

ostest: Microbenchmark: this_task retrieval (average): 35 ns  

By adding these details, the PR will meet the NuttX requirements and be easier for reviewers to evaluate.

@xiaoxiang781216 xiaoxiang781216 merged commit 1fad0f1 into apache:master Jan 2, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: x86_64 Issues related to the x86_64 architecture Size: S The size of the change in this PR is small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants