
arm64: Add BRBE support for bpf_get_branch_snapshot()#11396

Open
kernel-patches-daemon-bpf[bot] wants to merge 3 commits into bpf_base from series/1066521=>bpf

Conversation

@kernel-patches-daemon-bpf

Pull request for series with
subject: arm64: Add BRBE support for bpf_get_branch_snapshot()
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1066521

This is easily triggered with:

  perf record -b -e cycles -a -- ls

which crashes on the first context switch with:

  Unable to handle kernel NULL pointer dereference at virtual address 00[.]
  PC is at armv8pmu_sched_task+0x14/0x50
  LR is at perf_pmu_sched_task+0xac/0x108
  Call trace:
    armv8pmu_sched_task+0x14/0x50 (P)
    perf_pmu_sched_task+0xac/0x108
    __perf_event_task_sched_out+0x6c/0xe0
    prepare_task_switch+0x120/0x268
    __schedule+0x1e8/0x828
    ...

perf_pmu_sched_task() invokes the PMU sched callback with cpc->task_epc,
which is NULL when no per-task events exist for this PMU. With CPU-wide
branch-stack events, armv8pmu_sched_task() is still registered and
dereferences pmu_ctx->pmu unconditionally, causing the crash.

The bug was introduced by commit fa9d277 ("perf: arm_pmu: Kill last
use of per-CPU cpu_armpmu pointer") which changed the function from
using the per-CPU cpu_armpmu pointer (always valid) to dereferencing
pmu_ctx->pmu without adding a NULL check.

Add a NULL check for pmu_ctx to avoid the crash.

Fixes: fa9d277 ("perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>

Implement the perf_snapshot_branch_stack static call for ARM's Branch
Record Buffer Extension (BRBE), enabling the bpf_get_branch_snapshot()
BPF helper on ARM64.

This is a best-effort snapshot helper intended for tracing and debugging
use. It favors non-invasive snapshotting over strong serialization:
nested invocations are not serialized, and the helper returns a 0-length
result whenever a clean snapshot cannot be obtained.

BRBE is paused before the helper does any other work to avoid recording
its own branches. The sysreg writes used to pause are branchless.
local_daif_save() blocks local exception delivery while reading the
buffer. If a PMU overflow raced before that point and re-enabled BRBE,
the helper detects the cleared PAUSED state and returns 0.

Branch records are read using perf_entry_from_brbe_regset() without
event-specific filtering. The BPF program is responsible for applying
its own filter criteria. The BRBE buffer is invalidated after reading
to maintain contiguity for other consumers.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>

The get_branch_snapshot test checks that bpf_get_branch_snapshot()
doesn't waste too many branch entries on infrastructure overhead. The
threshold of < 10 was calibrated for x86 where about 7 entries are
wasted.

On ARM64, the BPF trampoline generates more branches than x86,
resulting in about 13 wasted entries. The overhead comes from
__bpf_prog_exit_recur which on ARM64 makes out-of-line calls to
__rcu_read_unlock and generates more conditional branches than x86:

  [#24] dump_bpf_prog+0x118d0       ->  __bpf_prog_exit_recur+0x0
  [#23] __bpf_prog_exit_recur+0x78  ->  __bpf_prog_exit_recur+0xf4
  [#22] __bpf_prog_exit_recur+0xf8  ->  __bpf_prog_exit_recur+0x80
  [#21] __bpf_prog_exit_recur+0x80  ->  __rcu_read_unlock+0x0
  [#20] __rcu_read_unlock+0x24      ->  __bpf_prog_exit_recur+0x84
  [#19] __bpf_prog_exit_recur+0xe0  ->  __bpf_prog_exit_recur+0x11c
  [#18] __bpf_prog_exit_recur+0x120 ->  __bpf_prog_exit_recur+0xe8
  [#17] __bpf_prog_exit_recur+0xf0  ->  dump_bpf_prog+0x118d4

Increase the threshold to < 16 to accommodate ARM64.

The test passes after the change:

 [root@(none) bpf]# ./test_progs -t get_branch_snapshot
 #136     get_branch_snapshot:OK
 Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
@kernel-patches-daemon-bpf
Author

Upstream branch: e06e6b8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1066521
version: 1

