arm64: Add BRBE support for bpf_get_branch_snapshot() #11396
Open
kernel-patches-daemon-bpf[bot] wants to merge 3 commits into bpf_base
This is easily triggered with:
perf record -b -e cycles -a -- ls
which crashes on the first context switch with:
Unable to handle kernel NULL pointer dereference at virtual address 00[.]
PC is at armv8pmu_sched_task+0x14/0x50
LR is at perf_pmu_sched_task+0xac/0x108
Call trace:
armv8pmu_sched_task+0x14/0x50 (P)
perf_pmu_sched_task+0xac/0x108
__perf_event_task_sched_out+0x6c/0xe0
prepare_task_switch+0x120/0x268
__schedule+0x1e8/0x828
...
perf_pmu_sched_task() invokes the PMU sched callback with cpc->task_epc,
which is NULL when no per-task events exist for this PMU. With CPU-wide
branch-stack events, armv8pmu_sched_task() is still registered and
dereferences pmu_ctx->pmu unconditionally, causing the crash.
The bug was introduced by commit fa9d277 ("perf: arm_pmu: Kill last
use of per-CPU cpu_armpmu pointer") which changed the function from
using the per-CPU cpu_armpmu pointer (always valid) to dereferencing
pmu_ctx->pmu without adding a NULL check.
Add a NULL check for pmu_ctx to avoid the crash.
Fixes: fa9d277 ("perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
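The guard described above can be illustrated with a small user-space model. This is only a sketch: `struct pmu_model`, `struct pmu_ctx_model`, and `sched_task_model()` are hypothetical stand-ins, not the kernel's types or the actual patch. It shows the callback returning early when it is handed a NULL context, as happens when only CPU-wide events exist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named above. */
struct pmu_model {
	int sched_calls;	/* counts callbacks that did real work */
};

struct pmu_ctx_model {
	struct pmu_model *pmu;
};

/*
 * Models the guarded sched callback. Returns false without touching
 * anything when pmu_ctx is NULL (no per-task events for this PMU),
 * and true after dereferencing pmu_ctx->pmu otherwise.
 */
bool sched_task_model(struct pmu_ctx_model *pmu_ctx, bool sched_in)
{
	if (!pmu_ctx)
		return false;

	(void)sched_in;		/* direction is irrelevant to the guard */
	pmu_ctx->pmu->sched_calls++;
	return true;
}
```

Without the early return, the first call with a NULL context would dereference it, which is the failure mode shown in the trace.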
Implement the perf_snapshot_branch_stack static call for ARM's Branch
Record Buffer Extension (BRBE), enabling the bpf_get_branch_snapshot()
BPF helper on ARM64.

This is a best-effort snapshot helper intended for tracing and
debugging use. It favors non-invasive snapshotting over strong
serialization, and returns 0 whenever a clean snapshot cannot be
obtained. Nested invocations are not serialized; callers may observe a
0-length result when a clean snapshot cannot be preserved.

BRBE is paused before the helper does any other work to avoid
recording its own branches. The sysreg writes used to pause are
branchless. local_daif_save() blocks local exception delivery while
reading the buffer. If a PMU overflow raced before that point and
re-enabled BRBE, the helper detects the cleared PAUSED state and
returns 0.

Branch records are read using perf_entry_from_brbe_regset() without
event-specific filtering. The BPF program is responsible for applying
its own filter criteria. The BRBE buffer is invalidated after reading
to maintain contiguity for other consumers.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
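The ordering described above (pause, mask exceptions, check the paused state, read, invalidate) can be sketched as a user-space model. Everything here is an assumption made for illustration: `struct brbe_model`, `brbe_model_snapshot()`, and the `race` parameter are hypothetical and do not touch real BRBE registers. Resuming recording afterwards is left out of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define BRBE_MODEL_MAX 64

struct branch_rec {
	unsigned long from;
	unsigned long to;
};

/* Hypothetical software model of the branch buffer state. */
struct brbe_model {
	bool paused;
	size_t nr;				/* valid records */
	struct branch_rec recs[BRBE_MODEL_MAX];
};

/*
 * Models the best-effort snapshot. When 'race' is true, a simulated
 * PMU overflow clears the paused state before exceptions are masked.
 * Returns the number of records copied, or 0 if no clean snapshot
 * could be obtained.
 */
size_t brbe_model_snapshot(struct brbe_model *b, struct branch_rec *out,
			   size_t cnt, bool race)
{
	size_t i, n;

	b->paused = true;		/* pause before any other work */

	if (race)
		b->paused = false;	/* simulated overflow re-enables recording */

	/* Exceptions would be masked from here on (local_daif_save()). */

	if (!b->paused)
		return 0;		/* paused state was cleared: bail out */

	n = b->nr < cnt ? b->nr : cnt;
	for (i = 0; i < n; i++)
		out[i] = b->recs[i];

	b->nr = 0;			/* invalidate the buffer after reading */
	return n;
}
```

In the model, a raced call leaves the records untouched and reports 0, while a clean call copies up to `cnt` records and empties the buffer.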
The get_branch_snapshot test checks that bpf_get_branch_snapshot()
doesn't waste too many branch entries on infrastructure overhead. The
threshold of < 10 was calibrated for x86 where about 7 entries are
wasted.

On ARM64, the BPF trampoline generates more branches than x86,
resulting in about 13 wasted entries. The overhead comes from
__bpf_prog_exit_recur which on ARM64 makes out-of-line calls to
__rcu_read_unlock and generates more conditional branches than x86:

[#24] dump_bpf_prog+0x118d0 -> __bpf_prog_exit_recur+0x0
[#23] __bpf_prog_exit_recur+0x78 -> __bpf_prog_exit_recur+0xf4
[#22] __bpf_prog_exit_recur+0xf8 -> __bpf_prog_exit_recur+0x80
[#21] __bpf_prog_exit_recur+0x80 -> __rcu_read_unlock+0x0
[#20] __rcu_read_unlock+0x24 -> __bpf_prog_exit_recur+0x84
[#19] __bpf_prog_exit_recur+0xe0 -> __bpf_prog_exit_recur+0x11c
[#18] __bpf_prog_exit_recur+0x120 -> __bpf_prog_exit_recur+0xe8
[#17] __bpf_prog_exit_recur+0xf0 -> dump_bpf_prog+0x118d4

Increase the threshold to < 16 to accommodate ARM64. The test passes
after the change:

[root@(none) bpf]# ./test_progs -t get_branch_snapshot
#136 get_branch_snapshot:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
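One way to picture the "wasted entries" metric is the user-space sketch below. It assumes that entries are ordered newest first and that an entry counts as wasted if it precedes the first branch lying entirely inside the traced function's address range. `struct snap_entry`, `count_wasted()`, and the range bounds are hypothetical names for illustration, not the selftest's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified branch entry: source and target address. */
struct snap_entry {
	unsigned long from;
	unsigned long to;
};

static bool in_range(unsigned long addr, unsigned long lo, unsigned long hi)
{
	return addr >= lo && addr < hi;
}

/*
 * Counts the entries seen before the first branch whose source and
 * target both fall in [lo, hi). If no such branch exists, every entry
 * counts as wasted.
 */
size_t count_wasted(const struct snap_entry *e, size_t n,
		    unsigned long lo, unsigned long hi)
{
	size_t i, wasted = 0;

	for (i = 0; i < n; i++) {
		if (in_range(e[i].from, lo, hi) && in_range(e[i].to, lo, hi))
			break;
		wasted++;
	}
	return wasted;
}
```

Under these assumptions, the threshold check amounts to comparing the returned count against the limit (previously 10, now 16).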
Upstream branch: e06e6b8
Pull request for series with
subject: arm64: Add BRBE support for bpf_get_branch_snapshot()
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1066521