Ultra-fast Fibonacci encoding/decoding methods. The implementation is based on the Fast Fibonacci Encoding Algorithm and the The Fast Fibonacci Decompression Algorithm, with additional optimizations to achieve even better performance. It is 10–30 times faster than the conventional algorithm (depending on the value distribution).
Note
All values are internally incremented during encoding and decremented during decoding to allow the representation of zero. The maximum supported value is
using static FastFibCode.Fibonacci;
UInt128 codeForULong = EncodeULong(0x0102030405060708);
ulong codeForUInt = EncodeUInt(0x01020304);
uint codeForShort = EncodeUShort(0x0102);
ushort codeForByte = EncodeByte(0x01);
MyFavoriteBitWriter bw = ...;
// assume lower bits are written first
bw.WriteBits(codeForULong, (int)UInt128.Log2(codeForULong) + 1);
bw.WriteBit(1); // "delimiter"
bw.WriteBits(codeForUInt, BitOperations.Log2(codeForUInt) + 1);
bw.WriteBit(1);
bw.WriteBits(codeForShort, BitOperations.Log2(codeForShort) + 1);
bw.WriteBit(1);
bw.WriteBits(codeForByte, BitOperations.Log2(codeForByte) + 1);
bw.WriteBit(1);
// =============================================================
MyFavoriteBitReader br = ...;
UInt128 code = br.ReadUpToDoubleOnes(); // with final '11'
code ^= UInt128.One << (int)UInt128.Log2(codeForULong); // clear the delimiter '1' bit
ulong value = DecodeAsULong(code);
// ...A comparison with conventional encoding/decoding algorithms was performed for various value distributions: Folded Normal (absolute value of a normal distribution), Exponential distribution, and Uniform distribution.
Each test processed an array of
The conventional algorithms used for comparison
ulong EncodeUInt(uint value, bool increment)
{
ulong v = value;
if (increment) v++;
ulong result = 0;
var log2 = BitOperations.Log2(v);
var maxIndex = log2 + (log2 >> 1) + 1;
maxIndex = Math.Min(maxIndex, 45);
for (int i = maxIndex; i >= 0; i--)
{
if (v == 0) return result;
var f = (uint)F[i];
if (v >= f)
{
result |= 1UL << i;
v -= f;
i--;
}
}
return result;
}
uint DecodeAsUInt(ulong code, bool decrement)
{
uint result = 0;
var maxIndex = BitOperations.Log2(code) + 1;
maxIndex = Math.Min(maxIndex, 45);
for (int i = 0; i <= maxIndex; i++)
{
var f = (uint)F[i];
if ((code & (1UL << i)) != 0)
{
result += f;
i++;
}
}
if (decrement) result--;
return result;
}Encoding benchmark results
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 9.0.302
[Host] : .NET 9.0.7 (9.0.725.31616), X64 RyuJIT AVX2
Job-ZJFXOJ : .NET 9.0.7 (9.0.725.31616), X64 RyuJIT AVX2
| Method | Distribution | Mean | Error | StdDev | Median | Ratio | RatioSD |
|---|---|---|---|---|---|---|---|
| EncodeConventional | FoldedNormal_100 | 39.102 ms | 0.7629 ms | 0.8163 ms | 38.901 ms | 1.00 | 0.03 |
| EncodeFast | FoldedNormal_100 | 1.542 ms | 0.0557 ms | 0.1607 ms | 1.489 ms | 0.04 | 0.00 |
| EncodeConventional | FoldedNormal_1K | 54.080 ms | 1.0371 ms | 1.2346 ms | 53.907 ms | 1.00 | 0.03 |
| EncodeFast | FoldedNormal_1K | 2.906 ms | 0.0506 ms | 0.0846 ms | 2.902 ms | 0.05 | 0.00 |
| EncodeConventional | FoldedNormal_10K | 69.528 ms | 1.2526 ms | 1.1716 ms | 69.309 ms | 1.00 | 0.02 |
| EncodeFast | FoldedNormal_10K | 4.413 ms | 0.0856 ms | 0.1083 ms | 4.424 ms | 0.06 | 0.00 |
| EncodeConventional | FoldedNormal_100K | 84.035 ms | 1.5328 ms | 1.5055 ms | 84.026 ms | 1.00 | 0.02 |
| EncodeFast | FoldedNormal_100K | 3.906 ms | 0.0746 ms | 0.0944 ms | 3.898 ms | 0.05 | 0.00 |
| EncodeConventional | FoldedNormal_1M | 98.920 ms | 1.8442 ms | 1.7251 ms | 98.421 ms | 1.00 | 0.02 |
| EncodeFast | FoldedNormal_1M | 3.941 ms | 0.0776 ms | 0.0924 ms | 3.958 ms | 0.04 | 0.00 |
| EncodeConventional | FoldedNormal_10M | 114.072 ms | 1.8719 ms | 1.7510 ms | 114.155 ms | 1.00 | 0.02 |
| EncodeFast | FoldedNormal_10M | 8.203 ms | 0.1287 ms | 0.1264 ms | 8.207 ms | 0.07 | 0.00 |
| EncodeConventional | Exponential_100 | 35.178 ms | 0.6949 ms | 1.1611 ms | 34.747 ms | 1.00 | 0.05 |
| EncodeFast | Exponential_100 | 1.537 ms | 0.0385 ms | 0.1092 ms | 1.509 ms | 0.04 | 0.00 |
| EncodeConventional | Exponential_1K | 51.469 ms | 0.9935 ms | 1.0203 ms | 51.420 ms | 1.00 | 0.03 |
| EncodeFast | Exponential_1K | 2.369 ms | 0.0456 ms | 0.0856 ms | 2.371 ms | 0.05 | 0.00 |
| EncodeConventional | Exponential_10K | 65.755 ms | 1.1349 ms | 1.0616 ms | 65.705 ms | 1.00 | 0.02 |
| EncodeFast | Exponential_10K | 5.003 ms | 0.0986 ms | 0.1211 ms | 4.995 ms | 0.08 | 0.00 |
| EncodeConventional | Exponential_100K | 80.600 ms | 1.2722 ms | 1.1900 ms | 80.906 ms | 1.00 | 0.02 |
| EncodeFast | Exponential_100K | 4.007 ms | 0.0782 ms | 0.0931 ms | 4.004 ms | 0.05 | 0.00 |
| EncodeConventional | Exponential_1M | 95.661 ms | 1.5023 ms | 1.3317 ms | 95.957 ms | 1.00 | 0.02 |
| EncodeFast | Exponential_1M | 3.930 ms | 0.0757 ms | 0.1086 ms | 3.924 ms | 0.04 | 0.00 |
| EncodeConventional | Exponential_10M | 110.287 ms | 1.8801 ms | 1.7587 ms | 110.172 ms | 1.00 | 0.02 |
| EncodeFast | Exponential_10M | 8.518 ms | 0.1262 ms | 0.1596 ms | 8.490 ms | 0.08 | 0.00 |
| EncodeConventional | Uniform_5M | 103.441 ms | 1.3436 ms | 1.2568 ms | 103.713 ms | 1.00 | 0.02 |
| EncodeFast | Uniform_5M | 3.917 ms | 0.0766 ms | 0.1321 ms | 3.908 ms | 0.04 | 0.00 |
Decoding benchmark results
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 9.0.302
[Host] : .NET 9.0.7 (9.0.725.31616), X64 RyuJIT AVX2
Job-ZJFXOJ : .NET 9.0.7 (9.0.725.31616), X64 RyuJIT AVX2
| Method | Distribution | Mean | Error | StdDev | Ratio | RatioSD |
|---|---|---|---|---|---|---|
| DecodeConventional | FoldedNormal_100 | 35.279 ms | 0.3204 ms | 0.2676 ms | 1.00 | 0.01 |
| DecodeFast | FoldedNormal_100 | 1.813 ms | 0.0450 ms | 0.1319 ms | 0.05 | 0.00 |
| DecodeConventional | FoldedNormal_1K | 48.935 ms | 0.9005 ms | 0.7520 ms | 1.00 | 0.02 |
| DecodeFast | FoldedNormal_1K | 3.008 ms | 0.0599 ms | 0.0531 ms | 0.06 | 0.00 |
| DecodeConventional | FoldedNormal_10K | 61.887 ms | 1.1508 ms | 1.1302 ms | 1.00 | 0.03 |
| DecodeFast | FoldedNormal_10K | 3.595 ms | 0.0717 ms | 0.1095 ms | 0.06 | 0.00 |
| DecodeConventional | FoldedNormal_100K | 74.222 ms | 1.4658 ms | 1.4396 ms | 1.00 | 0.03 |
| DecodeFast | FoldedNormal_100K | 2.686 ms | 0.0528 ms | 0.1191 ms | 0.04 | 0.00 |
| DecodeConventional | FoldedNormal_1M | 86.271 ms | 1.3631 ms | 1.2750 ms | 1.00 | 0.02 |
| DecodeFast | FoldedNormal_1M | 2.574 ms | 0.0510 ms | 0.0893 ms | 0.03 | 0.00 |
| DecodeConventional | FoldedNormal_10M | 99.897 ms | 1.5131 ms | 1.3413 ms | 1.00 | 0.02 |
| DecodeFast | FoldedNormal_10M | 5.395 ms | 0.1069 ms | 0.1664 ms | 0.05 | 0.00 |
| DecodeConventional | Exponential_100 | 31.101 ms | 0.6039 ms | 0.7637 ms | 1.00 | 0.03 |
| DecodeFast | Exponential_100 | 1.773 ms | 0.0384 ms | 0.1127 ms | 0.06 | 0.00 |
| DecodeConventional | Exponential_1K | 46.007 ms | 0.9017 ms | 0.9260 ms | 1.00 | 0.03 |
| DecodeFast | Exponential_1K | 2.551 ms | 0.0508 ms | 0.1135 ms | 0.06 | 0.00 |
| DecodeConventional | Exponential_10K | 59.651 ms | 0.9695 ms | 0.8595 ms | 1.00 | 0.02 |
| DecodeFast | Exponential_10K | 4.371 ms | 0.0815 ms | 0.0800 ms | 0.07 | 0.00 |
| DecodeConventional | Exponential_100K | 72.305 ms | 0.9786 ms | 0.7640 ms | 1.00 | 0.01 |
| DecodeFast | Exponential_100K | 2.782 ms | 0.0553 ms | 0.0925 ms | 0.04 | 0.00 |
| DecodeConventional | Exponential_1M | 85.477 ms | 1.5396 ms | 1.4402 ms | 1.00 | 0.02 |
| DecodeFast | Exponential_1M | 2.621 ms | 0.0514 ms | 0.0815 ms | 0.03 | 0.00 |
| DecodeConventional | Exponential_10M | 96.975 ms | 1.9232 ms | 1.7990 ms | 1.00 | 0.03 |
| DecodeFast | Exponential_10M | 6.630 ms | 0.1161 ms | 0.1192 ms | 0.07 | 0.00 |
| DecodeConventional | Uniform_5M | 91.944 ms | 1.7965 ms | 2.2063 ms | 1.00 | 0.03 |
| DecodeFast | Uniform_5M | 2.635 ms | 0.0519 ms | 0.0881 ms | 0.03 | 0.00 |