B4/rhash#11367

Draft
mykyta5 wants to merge 13 commits into kernel-patches:bpf-next_base from mykyta5:b4/rhash

Conversation

@mykyta5 mykyta5 commented Mar 11, 2026

No description provided.

mykyta5 added 13 commits March 11, 2026 16:22
This patch series introduces BPF_MAP_TYPE_RHASH, a new hash map type that
leverages the kernel's rhashtable to provide a resizable hash map for BPF.

The existing BPF_MAP_TYPE_HASH uses a fixed number of buckets determined at
map creation time. While this works well for many use cases, it presents
challenges when:

1. The number of elements is unknown at creation time
2. The element count varies significantly during runtime
3. Memory efficiency is important (over-provisioning wastes memory,
 under-provisioning hurts performance)

BPF_MAP_TYPE_RHASH addresses these issues by using rhashtable, which
automatically grows and shrinks based on load factor.

The implementation wraps the kernel's rhashtable with BPF map operations:

- Uses bpf_mem_alloc for RCU-safe memory management
- Supports all standard map operations (lookup, update, delete, get_next_key)
- Supports batch operations (lookup_batch, lookup_and_delete_batch)
- Supports BPF iterators for traversal
- Supports BPF_F_LOCK for spin locks in values
- Requires BPF_F_NO_PREALLOC flag (elements allocated on demand)
- max_entries serves as a hard limit, not bucket count
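From a BPF program, usage would presumably mirror the existing hash map declaration. This is a sketch only, assuming the new type is exposed to programs as BPF_MAP_TYPE_RHASH; the map name and sizes are illustrative:

```c
/* Hypothetical usage sketch in libbpf's BTF map declaration syntax. */
struct {
	__uint(type, BPF_MAP_TYPE_RHASH);
	__uint(map_flags, BPF_F_NO_PREALLOC);	/* required by this map type      */
	__uint(max_entries, 10000);		/* hard element limit, not buckets */
	__type(key, __u32);
	__type(value, __u64);
} resizable_map SEC(".maps");
```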

The series includes comprehensive tests:
- Basic operations in test_maps (lookup, update, delete, get_next_key)
- BPF program tests for lookup/update/delete semantics
- BPF_F_LOCK tests with concurrent access
- Stress tests for get_next_key during concurrent resize operations
- Seq file tests

Signed-off-by: Mykyta Yatsenko <[email protected]>

---
The current implementation of BPF_MAP_TYPE_RHASH does not provide
the same strong guarantees on value consistency under concurrent
reads/writes as BPF_MAP_TYPE_HASH.
BPF_MAP_TYPE_HASH allocates a new element and atomically swaps the
pointer, so RCU readers always see a complete value. BPF_MAP_TYPE_RHASH
does the memcpy in place with no lock held.
rhash trades consistency for speed (a 5x improvement in the update
benchmark): concurrent readers can observe partially updated data, and
two concurrent writers to the same key can interleave, producing mixed
values.
As a solution, users can pass BPF_F_LOCK to guarantee consistent reads
and serialized writes.
Summary of the read consistency guarantees:
  map type     |  write mechanism |  read consistency
  -------------+------------------+--------------------------
  htab         |  alloc, swap ptr |  always consistent (RCU)
  htab  F_LOCK |  in-place + lock |  consistent if reader locks
  -------------+------------------+--------------------------
  rhtab        |  in-place memcpy |  torn reads
  rhtab F_LOCK |  in-place + lock |  consistent if reader locks

Changes in v2:
- Added benchmarks
- Link to v1: https://lore.kernel.org/r/[email protected]

--- b4-submit-tracking ---
{
  "series": {
    "revision": 2,
    "change-id": "20251103-rhash-7b70069923d8",
    "prefixes": [
      "RFC bpf-next"
    ],
    "history": {
      "v1": [
        "[email protected]"
      ]
    }
  }
}
Add the resizable hash map to the enums where it is needed.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Introduce basic operations for BPF_MAP_TYPE_RHASH, a new hash map type
built on top of the kernel's rhashtable.

Key implementation details:
- Uses rhashtable for automatic resizing with RCU-safe operations
- Elements allocated via bpf_mem_alloc for lock-free allocation
- Supports BPF_F_LOCK for spin_lock protected values
- Requires BPF_F_NO_PREALLOC

Implemented map operations:
 * map_alloc/map_free: Initialize and destroy the rhashtable
 * map_lookup_elem: RCU-protected lookup via rhashtable_lookup
 * map_update_elem: Insert or update with BPF_NOEXIST/EXIST/ANY
 * map_delete_elem: Remove element with RCU-deferred freeing
 * map_get_next_key: Returns the next key in the table
 * map_release_uref: Free internal structs (timers, workqueues)

Other operations (batch, seq file) are implemented in the next patch.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Add batch operations and BPF iterator support for BPF_MAP_TYPE_RHASH.

Batch operations:
 * rhtab_map_lookup_batch: Bulk lookup of elements by bucket
 * rhtab_map_lookup_and_delete_batch: Atomic bulk lookup and delete

The batch implementation iterates through buckets under RCU protection,
copying keys and values to userspace buffers. When the buffer fills
mid-bucket, it rolls back to the bucket boundary so the next call can
retry that bucket completely.

BPF iterator:
 * Uses rhashtable_walk_* API for safe iteration
 * Handles -EAGAIN during table resize transparently
 * Tracks skip_elems to resume iteration across read() calls

Also implements rhtab_map_mem_usage() to report memory consumption.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Test basic map operations (lookup, update, delete) for
BPF_MAP_TYPE_RHASH including boundary conditions like duplicate
key insertion and deletion of nonexistent keys.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Add tests validating that the resizable hash map handles the BPF_F_LOCK
flag as expected.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Test get_next_key behavior under concurrent modification:
 * Resize test: verify all elements visited after resize trigger
 * Stress test: concurrent iterators and modifiers to detect races

Signed-off-by: Mykyta Yatsenko <[email protected]>
Test BPF iterator functionality for BPF_MAP_TYPE_RHASH:
 * Basic iteration verifying all elements are visited
 * Overflow test triggering seq_file restart, validating correct
resume behavior via skip_elems tracking

Signed-off-by: Mykyta Yatsenko <[email protected]>
Make bpftool documentation aware of the resizable hash map.

Signed-off-by: Mykyta Yatsenko <[email protected]>
Support resizable hashmap in BPF map benchmarks.

Results:
$ sudo ./bench -w3 -d10 -a bpf-rhashmap-full-update
0:hash_map_full_perf 21641414 events per sec

$ sudo ./bench -w3 -d10 -a bpf-hashmap-full-update
0:hash_map_full_perf 4392758 events per sec

$ sudo ./bench -w3 -d10 -a -p8 htab-mem --use-case overwrite --value-size 8
Iter   0 (302.834us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   1 (-44.810us): per-prod-op   62.81k/s, memory usage    2.70MiB
Iter   2 (-45.821us): per-prod-op   62.81k/s, memory usage    2.70MiB
Iter   3 (-63.658us): per-prod-op   62.92k/s, memory usage    2.70MiB
Iter   4 ( 32.887us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   5 (-76.948us): per-prod-op   62.75k/s, memory usage    2.70MiB
Iter   6 (157.235us): per-prod-op   63.01k/s, memory usage    2.70MiB
Iter   7 (-118.761us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   8 (127.139us): per-prod-op   62.92k/s, memory usage    2.70MiB
Iter   9 (-169.908us): per-prod-op   62.99k/s, memory usage    2.70MiB
Iter  10 (101.962us): per-prod-op   62.97k/s, memory usage    2.70MiB
Iter  11 (-64.330us): per-prod-op   63.05k/s, memory usage    2.70MiB
Iter  12 (-20.543us): per-prod-op   62.86k/s, memory usage    2.70MiB
Iter  13 ( 55.382us): per-prod-op   62.95k/s, memory usage    2.70MiB
Summary: per-prod-op   62.92 ±    0.09k/s, memory usage    2.70 ±    0.00MiB, peak memory usage    2.96MiB

$ sudo ./bench -w3 -d10 -a -p8 rhtab-mem --use-case overwrite --value-size 8
Iter   0 (316.805us): per-prod-op   96.40k/s, memory usage    2.71MiB
Iter   1 (-35.225us): per-prod-op   96.54k/s, memory usage    2.71MiB
Iter   2 (-12.431us): per-prod-op   96.54k/s, memory usage    2.71MiB
Iter   3 (-56.537us): per-prod-op   96.58k/s, memory usage    2.71MiB
Iter   4 ( 27.108us): per-prod-op   96.62k/s, memory usage    2.71MiB
Iter   5 (-52.491us): per-prod-op   96.57k/s, memory usage    2.71MiB
Iter   6 ( -2.777us): per-prod-op   96.52k/s, memory usage    2.71MiB
Iter   7 (108.963us): per-prod-op   96.45k/s, memory usage    2.71MiB
Iter   8 (-61.575us): per-prod-op   96.48k/s, memory usage    2.71MiB
Iter   9 (-21.595us): per-prod-op   96.14k/s, memory usage    2.71MiB
Iter  10 (  3.243us): per-prod-op   96.36k/s, memory usage    2.71MiB
Iter  11 (  3.102us): per-prod-op   94.70k/s, memory usage    2.71MiB
Iter  12 (109.102us): per-prod-op   95.77k/s, memory usage    2.71MiB
Iter  13 ( 16.153us): per-prod-op   95.91k/s, memory usage    2.71MiB
Summary: per-prod-op   96.19 ±    0.57k/s, memory usage    2.71 ±    0.00MiB, peak memory usage    2.71MiB

$ sudo ./bench -w3 -d10 -a bpf-hashmap-lookup --key_size 4\
  --max_entries 1000 --nr_entries 500 --nr_loops 1000000
cpu00: lookup 28.603M ± 0.536M events/sec (approximated from 32 samples of ~34ms)

$ sudo ./bench -w3 -d10 -a bpf-rhashmap-lookup --key_size 4\
  --max_entries 1000 --nr_entries 500 --nr_loops 1000000
cpu00: lookup 27.340M ± 0.864M events/sec (approximated from 32 samples of ~36ms)

Signed-off-by: Mykyta Yatsenko <[email protected]>
@kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 9 times, most recently from 6f71402 to 3aabcc8 on March 17, 2026 at 23:03