Cuckoo filter is a Bloom filter replacement for approximate set-membership queries. While Bloom filters are well-known space-efficient data structures for answering queries like "is item x in a set?", they do not support deletion. Variants that do support deletion (like counting Bloom filters) usually require much more space.
@@ -10,10 +10,23 @@ For details about the algorithm and citations please use this article for now
["Cuckoo Filter: Better Than Bloom" by Bin Fan, Dave Andersen and Michael Kaminsky](https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf)
-## Note
-This implementation uses a static bucket size of 4 fingerprints and a fingerprint size of 1 byte, based on my understanding of an optimal bucket/fingerprint/size ratio from the aforementioned paper.
+## Implementation details
+
+The paper cited above leaves several parameters open. In this implementation:
+
+1. Every element has 2 possible bucket indices
+2. Buckets have a static size of 4 fingerprints
+3. Fingerprints have a static size of 16 bits
+
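The three parameters listed above can be illustrated with a minimal Go sketch of how two candidate bucket indices might be derived from one element, following the partial-key cuckoo hashing scheme described in the cited paper. The FNV-1a hash, the helper names (`hash64`, `fingerprint`, `altIndex`, `candidateBuckets`), and the convention of reserving fingerprint 0 for empty slots are all assumptions made for illustration; they are not taken from this repository's code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hash64 is a stand-in hash function (FNV-1a). The choice of hash is an
// assumption of this sketch, not necessarily what the repository uses.
func hash64(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data)
	return h.Sum64()
}

// fingerprint derives a 16-bit fingerprint from the upper bits of the hash.
// Zero is remapped to 1 so that 0 could mark an empty slot (assumption).
func fingerprint(h uint64) uint16 {
	fp := uint16(h >> 48)
	if fp == 0 {
		fp = 1
	}
	return fp
}

// altIndex returns the other candidate bucket for a fingerprint stored in
// bucket i. numBuckets must be a power of two so that applying altIndex
// twice leads back to the starting bucket.
func altIndex(i uint64, fp uint16, numBuckets uint64) uint64 {
	fpHash := hash64([]byte{byte(fp), byte(fp >> 8)})
	return (i ^ fpHash) & (numBuckets - 1)
}

// candidateBuckets returns both bucket indices and the fingerprint
// for an element.
func candidateBuckets(data []byte, numBuckets uint64) (i1, i2 uint64, fp uint16) {
	h := hash64(data)
	fp = fingerprint(h)
	i1 = h & (numBuckets - 1)
	i2 = altIndex(i1, fp, numBuckets)
	return i1, i2, fp
}

func main() {
	const numBuckets = 1024 // hypothetical table size, a power of two
	i1, i2, fp := candidateBuckets([]byte("example"), numBuckets)
	fmt.Printf("fingerprint=%#04x buckets=%d,%d\n", fp, i1, i2)
	// The alternate index is computable from the stored fingerprint alone.
	fmt.Println("round trip ok:", altIndex(i2, fp, numBuckets) == i1)
}
```

Because the second index depends only on the first index and the fingerprint, a stored fingerprint can be moved to its alternate bucket without access to the original element.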
+Choices 1 and 2 are the values the authors suggest as optimal. Choice 3 depends on the desired false positive rate. Given a target false positive rate `r` and a bucket size `b`, they suggest choosing the fingerprint size `f` using
+
+f >= log2(2b/r) bits
+
+With the 16 bit fingerprint size in this repository, you can expect `r ~= 0.0001`.
+[Other implementations](https://github.com/seiflotfy/cuckoofilter) use 8 bits, which corresponds to a false positive rate of `r ~= 0.03`.
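The quoted rates follow from rearranging `f >= log2(2b/r)` into `r = 2b / 2^f`. The Go sketch below works through that arithmetic; the function names `falsePositiveRate` and `minFingerprintBits` are hypothetical helpers introduced here and are not part of this repository's API.

```go
package main

import (
	"fmt"
	"math"
)

// falsePositiveRate returns the bound r = 2b / 2^f, obtained by
// rearranging f >= log2(2b/r) for r.
func falsePositiveRate(bucketSize, fingerprintBits uint) float64 {
	return float64(2*bucketSize) / float64(uint64(1)<<fingerprintBits)
}

// minFingerprintBits returns the smallest whole number of bits f that
// satisfies f >= log2(2b/r) for a target rate r.
func minFingerprintBits(bucketSize uint, targetRate float64) int {
	return int(math.Ceil(math.Log2(float64(2*bucketSize) / targetRate)))
}

func main() {
	fmt.Printf("b=4, f=16: r ~= %.6f\n", falsePositiveRate(4, 16))
	fmt.Printf("b=4, f=8:  r ~= %.6f\n", falsePositiveRate(4, 8))
	fmt.Printf("b=4, r=0.001 needs f >= %d bits\n", minFingerprintBits(4, 0.001))
}
```

With `b = 4`, 16 bits gives `8/65536`, roughly 0.00012, and 8 bits gives `8/256`, roughly 0.031, which matches the approximate figures quoted above.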