This project implements a Cuckoo Filter, inspired by the paper "Cuckoo Filter: Practically Better Than Bloom" by Bin Fan, David G. Andersen, and Michael Kaminsky (CoNEXT 2014).
go get github.com/wongzc/go-cuckoo-filter/cuckoofilter
A Cuckoo Filter is a probabilistic data structure used for set membership tests, similar to a Bloom Filter—but with added support for deletions, better lookup performance, and improved space efficiency when the desired false positive rate (FPR) is under 3%.
Instead of storing the full item, it stores a short fingerprint of each item in one of two possible buckets. Like Cuckoo Hashing, it may relocate existing entries to make space for new ones.
- Supports dynamic add and delete operations.
- Faster lookups than Bloom Filters, even at 95% capacity.
- Simpler to implement than Bloom Filter variants that support deletion.
- More space-efficient than Bloom Filters when FPR < 3%.
- Uses 1 hash functions.
- Each item has two candidate buckets:
h1(x)
andh2(x) = h1(x) ⊕ hash(fingerprint(x))
. - If both buckets are full, evict an existing item (Cuckooing), and recursively reinsert the evicted one.
- Each bucket may store multiple entries (typically 2–8).
- Average insertion time is O(1).
- Only stores a short fingerprint of each item.
- Fingerprint size is based on the desired false positive rate (ε).
- Insertion uses partial-key Cuckoo hashing, relying only on the fingerprint.
- Compute fingerprint
f = fingerprint(x)
. - Check if
f
exists in eitherh1(x)
orh2(x)
. - Guarantees no false negatives unless an insertion failed due to full capacity.
- Check both buckets for the fingerprint.
- Remove it if found.
- No risk of false deletion:
- If multiple items share a fingerprint, their alternate bucket will still retain one valid copy.
- FPR remains unchanged after deletion.
- Smaller
b
→ faster lookup. - Larger
b
→ higher load factor (more space efficiency).
ε
: false positive rateb
: bucket sizef
: bits per fingerprint
n
: number of itemsα
: expected load factor (e.g. 0.84 for b=4)b
: bucket size- Round
m
up to the next power of 2 for fast modulo operations using bit masking.
n
: number of items- Used to cap the number of relocation attempts during insertion.
package main
import (
"fmt"
"github.com/wongzc/go-cuckoo-filter/cuckoofilter"
)
func main() {
cf := cuckoofilter.New(1000, 0.01, 4)
cf.Set("hello")
fmt.Println(cf.Get("hello")) // true
fmt.Println(cf.Get("world")) // probably false
cf.Del("hello")
fmt.Println(cf.Get("hello")) // false
}