Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gdb: add tool to check heap memory corruption #15726

Merged
merged 1 commit into from
Jan 31, 2025

Conversation

XuNeo
Copy link
Contributor

@XuNeo XuNeo commented Jan 30, 2025

Note: Please adhere to Contributing Guidelines.

Summary

Only heap nodes are checked for now.

Usage:

(gdb) mm check
Check done.
(gdb)

For memory that has corruption, it will list the nodes.
The memory check command itself may also fail due to corruption, the node will also be printed. It is also enough for manually inspection.

Impact

New feature.

Testing

Update this section with details on how did you verify the change,
what Host was used for build (OS, CPU, compiler, ..), what Target was
used for verification (arch, board:config, ..), etc. Providing build
and runtime logs from before and after change is highly appreciated.

Only heap nodes are checked for now.

Signed-off-by: xuxingliang <[email protected]>
@github-actions github-actions bot added Area: Tooling Size: M The size of the change in this PR is medium labels Jan 30, 2025
@nuttxpr
Copy link

nuttxpr commented Jan 30, 2025

[Experimental Bot, please feedback here]

No, this PR description does not meet the NuttX requirements. Here's why and how to fix it:

Missing Information in Summary:

  • Why is this change necessary? The description only states what it does ("Only heap nodes are checked for now"). Explain the problem this solves. Is it for debugging? Improving memory safety? Does it address a specific bug or performance issue?
  • What functional part of the code is being changed? Be specific. Is this a new GDB command? Modifications to the MMU? Changes to the heap allocator itself?
  • How does the change exactly work? "It will list the nodes" is insufficient. Explain the mechanism. Does it walk the heap data structures? Does it use specific hardware features? What algorithms are employed?

Missing Information in Impact:

  • While it states "New Feature," all other impact sections are left with a simple "NO" or blank. Consider these carefully:
    • Impact on user: Will users need to learn a new GDB command? Will this change affect application behavior?
    • Impact on build: Will anything need to be configured to enable this feature?
    • Impact on hardware: Does this rely on specific hardware features? Does it work on all supported architectures?
    • Impact on documentation: A new feature definitely requires documentation updates. Explain where the documentation will be added/updated.
    • Impact on security: Does this improve memory safety, thus having a positive security impact? Or could it potentially introduce vulnerabilities?
    • Impact on compatibility: Does this break any existing functionality?

Missing Information in Testing:

  • Build Host(s): Provide details about the development environment: OS, CPU architecture, compiler version (e.g., "Linux x86_64, GCC 12.2").
  • Target(s): Specify the target architecture and board/configuration used for testing (e.g., "sim:qemu-x86_64," "stm32f4discovery:nsh").
  • Testing logs: The provided logs are placeholders. Include actual logs demonstrating the command's behavior both before and after the change. The "before" logs should ideally show the lack of this functionality or the problem being solved. The "after" logs should show the successful operation of the new feature. Show examples of both successful checks and the detection of corrupted heap nodes.

Example of Improved PR Description:

## Summary

This PR introduces a new GDB command, `mm check`, to check for heap memory corruption.  This is necessary to aid in debugging memory-related issues and improve overall system stability.  The change modifies the GDB stub and adds a new command handler.  The `mm check` command traverses the heap data structures, verifying the integrity of each allocated node. It checks for inconsistencies like invalid pointers or corrupted size fields. If corruption is detected, the command reports the address and details of the affected nodes.  The command may also fail if it encounters severe corruption that prevents further traversal.  Even in such cases, the command will attempt to report the corrupted node encountered before the failure.

## Impact

* **Is new feature added?** YES, the `mm check` GDB command.
* **Impact on user:** YES, users can now use the `mm check` command within GDB to debug heap corruption. This will require updating the NuttX documentation.
* **Impact on build:** NO.
* **Impact on hardware:** NO. This feature should be compatible with all architectures.
* **Impact on documentation:** YES, the GDB command reference will be updated to include documentation for `mm check`.  The documentation will explain the command's usage, output, and limitations.
* **Impact on security:** YES, this feature can help identify and diagnose memory corruption issues which could have security implications.
* **Impact on compatibility:** NO.

## Testing

I confirm that changes are verified on my local setup and work as intended:
* Build Host(s): Linux x86_64, GCC 12.2
* Target(s): sim:qemu-x86_64, stm32f4discovery:nsh

Testing logs before change:

(gdb) mm check
Undefined command: "mm check". Try "help".


Testing logs after change:

(gdb) mm check
Check done.
(gdb)

Example with simulated corruption (injected manually for testing):

(gdb) mm check
Heap corruption detected at address 0x12345678: Invalid size field.

(gdb) mm check # another simulated corruption example
Heap corruption detected at address 0x87654321: Invalid next pointer.
Check aborted due to severe corruption.


Documentation update will be provided in a separate commit.

By providing more specific details and actual test results, the PR will be much easier to review and understand. Remember to replace the example logs and details with your actual testing environment and results. Also, make sure to follow the guidelines in the CONTRIBUTING.md document.

@xiaoxiang781216 xiaoxiang781216 merged commit c1f9c73 into apache:master Jan 31, 2025
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Tooling Size: M The size of the change in this PR is medium
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants