Skip to content

use xxhash instead of jenkins one-at-a-time hash, if available #2365

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 5, 2025

Conversation

irfanHaslanded
Copy link
Contributor

@irfanHaslanded irfanHaslanded commented Mar 4, 2025

In several benchmarks, xxhash scores far better than the jenkins one-at-a-time hash function.

See xxhash.com for a comparison.

In several benchmarks, xxhash scores far better than the jenkins one at
a time hash function.

See https://xxhash.com/ for a comparison.
@irfanHaslanded irfanHaslanded changed the title use xxhash instead of jenkins hash use xxhash instead of jenkins one-at-a-time hash Mar 4, 2025
@irfanHaslanded
Copy link
Contributor Author

irfanHaslanded commented Mar 4, 2025

For exporting some data, and checking with perf report
with these changes

  Children      Self  Command     Shared Object         Symbol
+    1.03%     0.06%  sysrepocfg  libyang.so.3.8.0  [.] lyht_hash
     0.50%     0.20%  sysrepocfg  libyang.so.3.8.0  [.] lyht_hash_multi

without these changes

  Children      Self  Command     Shared Object         Symbol
+   13.09%    13.09%  sysrepocfg  libyang.so.3.8.0  [.] lyht_hash_multi
+   11.39%     0.05%  sysrepocfg  libyang.so.3.8.0  [.] lyht_hash

@michalvasko
Copy link
Member

Yes, for context creation I also measured about 12% improvement, which is nice. However, I also measured the other use-case, parsing data, where the improvement was 1%. With these numbers I am not sure what to do, we are working on a major context handling optimization so it will be even less relevant. Why did you start looking into this even, do you have a specific use-case where the hashing was taking too long?

@irfanHaslanded
Copy link
Contributor Author

Why did you start looking into this even, do you have a specific use-case where the hashing was taking too long?

Yes, the perf report data I gave above, is from a real device use-case, where hashing was taking 12% of the time. I think this depends on the type of data sampled then? If the data values are long strings, maybe the current hashing becomes a time consumer.

@michalvasko
Copy link
Member

What do you think about making the dependency optional? There are minimal code changes so it should not cause any mess and would avoid a new required dependency.

@irfanHaslanded irfanHaslanded changed the title use xxhash instead of jenkins one-at-a-time hash use xxhash instead of jenkins one-at-a-time hash, if available Mar 5, 2025
@irfanHaslanded
Copy link
Contributor Author

What do you think about making the dependency optional? There are minimal code changes so it should not cause any mess and would avoid a new required dependency.

Sounds good.

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine, I will do some minor refactoring changes myself, will be much faster, thanks.

@michalvasko michalvasko merged commit da81ee5 into CESNET:devel Mar 5, 2025
11 checks passed
@bradh352
Copy link
Contributor

bradh352 commented Mar 7, 2025

Just a comment on this, xxhash is usually better for long strings like checksums, not necessarily hash tables where it may be slower because of the higher initial overhead. I'd think most hashtable hashes are going to be less than 32 characters so a simpler hash usually wins.

I always gravitate towards FNV1a because its super fast, small, and has good distribution. I use it in c-ares here:
https://github.com/c-ares/c-ares/blob/main/src/lib/dsa/ares_htable.c#L418-L431

(note the seed to that implementation can be 0 if you're not worried about hash collision attacks)

I'm going to guess that the speed difference is likely due to distribution/collisions between the algorithms and not the actual hash performance itself.

I hate to see a dependency pulled in for something so minor. A good overview with visual representation and speed of some hashes is here:
https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed

smhasher also has some numbers
https://github.com/rurban/smhasher

Whatever tests that you ran, any chance you could re-test with FNV1a as an alternative?

@michalvasko
Copy link
Member

I hate to see a dependency pulled in for something so minor.

That is why it is optional.

But it would be great if @irfanHaslanded could try the FNV1 hash and see if it fixes the performance problems. If so, we can just use that.

@irfanHaslanded
Copy link
Contributor Author

Sure, I will try it out as well.

@irfanHaslanded
Copy link
Contributor Author

Seems quite slow, the same as one-at-a-time hash for data

+    9.50%     9.50%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash_multi
+    8.44%     0.00%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash

Compiled same as before with -O2 gcc flags

LIBYANG_API_DEF uint32_t
lyht_hash_multi(uint32_t hash, const char *key_part, size_t len)
{
    unsigned int hv = hash ^ 2166136261U;
    size_t i;

    if (key_part) {
        for (i = 0; i < len; i++) {
            hv ^= (unsigned int)key_part[i];
            /* hv *= 16777619 (0x01000193) */
            hv += (hv << 1) + (hv << 4) + (hv << 7) + (hv << 8) + (hv << 24);
        }
    }

    return hv;
}

LIBYANG_API_DEF uint32_t
lyht_hash(const char *key, size_t len)
{
    uint32_t hash;

    hash = lyht_hash_multi(0, key, len);
    return lyht_hash_multi(hash, NULL, len);
}

@bradh352
Copy link
Contributor

bradh352 commented Mar 10, 2025

Seems quite slow, the same as one-at-a-time hash for data

+    9.50%     9.50%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash_multi
+    8.44%     0.00%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash

Compiled same as before with -O2 gcc flags

LIBYANG_API_DEF uint32_t
lyht_hash_multi(uint32_t hash, const char *key_part, size_t len)
{
    unsigned int hv = hash ^ 2166136261U;
    size_t i;

    if (key_part) {
        for (i = 0; i < len; i++) {
            hv ^= (unsigned int)key_part[i];
            /* hv *= 16777619 (0x01000193) */
            hv += (hv << 1) + (hv << 4) + (hv << 7) + (hv << 8) + (hv << 24);
        }
    }

    return hv;
}

LIBYANG_API_DEF uint32_t
lyht_hash(const char *key, size_t len)
{
    uint32_t hash;

    hash = lyht_hash_multi(0, key, len);
    return lyht_hash_multi(hash, NULL, len);
}

lyht_hash() should just be return lyht_hash_multi(0, key, len), don't call lyht_hash_multi() twice. It likely messes up distribution.

Also, what is the context of the benchmark you're running? Meaning what else is being called? Is it saying the hash itself is taking up that much time from the entire test? How did the total execution time compare as well? Depending on the hash the overall run time may be better if there are fewer collisions, and so on...

@irfanHaslanded
Copy link
Contributor Author

irfanHaslanded commented Mar 10, 2025

lyht_hash() should just be return lyht_hash_multi(0, key, len), don't call lyht_hash_multi() twice. It likely messes up distribution.

The same test with updated lyht_hash(), did not make much of a difference either way.

LIBYANG_API_DEF uint32_t
lyht_hash(const char *key, size_t len)
{
    return lyht_hash_multi(0, key, len);
}

Is it saying the hash itself is taking up that much time from the entire test?

Yes.

How did the total execution time compare as well?

The overall time taken by fnv is not much different than one-at-a-time hash.

Depending on the hash the overall run time may be better if there are fewer collisions, and so on...

Is there any reason for the quality of xxhash to be inferior to fnv? If not, I wouldn't think that collisions would be significantly worse.

xxh

$ cat xxh.perf
Samples: 10K of event 'cycles', Event count (approx.): 7920116200
  Children      Self  Command     Shared Object     Symbol
+   12.66%    10.90%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find_rec                                                                                                              ◆
+    6.47%     0.23%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find_with_val_cb                                                                                                      ▒
+    5.74%     0.55%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find                                                                                                                  ▒
+    3.20%     0.11%  sysrepocfg  libyang.so.3.8.2  [.] lyht_resize                                                                                                                ▒
+    2.97%     1.09%  sysrepocfg  libyang.so.3.8.2  [.] _lyht_insert_with_resize_cb                                                                                                ▒
+    2.86%     0.18%  sysrepocfg  libyang.so.3.8.2  [.] lyht_init_hlists_and_records                                                                                               ▒
+    2.43%     0.12%  sysrepocfg  libyang.so.3.8.2  [.] lyht_remove_with_resize_cb                                                                                                 ▒
+    2.13%     0.56%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash_multi                                                                                                            ▒
+    2.02%     0.00%  sysrepocfg  libyang.so.3.8.2  [.] lyht_get_rec (inlined)                                                                                                     ▒
+    1.63%     0.17%  sysrepocfg  libyang.so.3.8.2  [.] lyht_free                                                                                                                  ▒
+    0.64%     0.05%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash                                                                                                                  ▒
+    0.54%     0.02%  sysrepocfg  libyang.so.3.8.2  [.] lyht_new                                                                                                                   ▒

fnv hash

$ cat fnv.perf
Samples: 10K of event 'cycles', Event count (approx.): 8401485919
  Children      Self  Command     Shared Object     Symbol
+   11.02%     9.55%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find_rec
+    9.07%     9.06%  sysrepocfg  libyang.so.3.8.2  [.] lyht_hash_multi
+    5.52%     0.24%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find_with_val_cb
+    4.91%     0.45%  sysrepocfg  libyang.so.3.8.2  [.] lyht_find
+    2.99%     0.22%  sysrepocfg  libyang.so.3.8.2  [.] lyht_resize
+    2.87%     0.80%  sysrepocfg  libyang.so.3.8.2  [.] _lyht_insert_with_resize_cb
+    2.55%     0.13%  sysrepocfg  libyang.so.3.8.2  [.] lyht_init_hlists_and_records
+    2.21%     0.09%  sysrepocfg  libyang.so.3.8.2  [.] lyht_remove_with_resize_cb
+    1.54%     0.00%  sysrepocfg  libyang.so.3.8.2  [.] lyht_get_rec (inlined)
+    1.36%     0.19%  sysrepocfg  libyang.so.3.8.2  [.] lyht_free
     0.45%     0.05%  sysrepocfg  libyang.so.3.8.2  [.] lyht_new

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants