Kernel: Remove misleading FIXME in memcpy #26531
Open
+0
−1
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Removes the misleading FIXME comment in kernel
memcpyabout supporting unaligned addresses. Benchmarking shows the current implementation is already optimal for typical kernel usage patterns.Motivation
The FIXME suggested that unaligned address handling needed improvement. However, comprehensive benchmarks reveal:
The performance issue only manifests for large (≥4KB) page-crossing copies with unaligned destinations, which represent ~4% of kernel usage and would require significant complexity to optimize.
Key Benchmark Findings
Small Copies (64-512 bytes) - Current Code Wins
Large Page-Crossing Copies (4KB) - Performance Cliff
Why the difference?
rep movsbfalls back to byte-by-byte copying across the page boundaryKernel Usage Distribution
Decision Rationale
While there is a legitimate 14x performance issue for 4KB unaligned copies, fixing it would:
The FIXME is misleading because it implies the current code is deficient for general unaligned handling, when it's actually optimal for typical usage.
Future work: If page copy performance becomes critical, a size-based threshold could use alignment only for large copies (≥2KB).
Detailed Benchmark Results
Click to expand full benchmark data
Test Environment
Implementations Tested
rep movsqTest Scenarios
Results by Size
SIZE = 64 bytes
SIZE = 128 bytes
SIZE = 256 bytes
SIZE = 512 bytes
SIZE = 1024 bytes
SIZE = 4096 bytes (ONE PAGE)⚠️
Analysis Summary
Small copies dominate kernel usage (80% are <256 bytes)
Medium copies (256B-1KB) still favor current approach
Large page-crossing copies (4KB+) show severe degradation
rep movsbacross page boundariesMicroarchitectural behavior
Benchmark Code
Click to expand benchmark source