Minor fixes for some formats where cmp_all() often returned non-zero #5782
I tried benchmarking all formats with `bench.c` patched to alert when `cmp_all` returns non-zero. Here's the list I got:

It's not that bad. I expected it'd be the case for many more.
Out of these:

- `BestCryptVE4` had a bug (fixed here, so would no longer get on the above list)
- `BKS` turned out to have frequent collisions (true or false positives? either way, we now dynamically set `FMT_NOT_EXACT`)
- `krb4` only has parity check in `cmp_all` and real computation and check in `cmp_one` (not changed here, but ideally we'd make bigger changes also adding OpenMP)
- `openssl-enc` is known to have frequent false positives (no good fix)
- `PKZIP` only checks short checksums, deferring further checking to `cmp_exact` (this may be fine, albeit not optimal with OpenMP if `cmp_exact` is reached so often)
- `Siemens-S7` pre-checks 32 bits in `cmp_all` and is fast, so it's just luck that one 32-bit collision occurred for it during my testing (that's fine, could also occur for many other formats - but maybe a reminder that we could prefer `ARCH_WORD` or 64-bit checks now that we mostly care about performance on 64-bit archs)

Also patched here is "BKS format: With SIMD, limit final scalar loops to actual key count", but I'm not sure we should. The same could apply to some other formats.
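To illustrate the `Siemens-S7` remark about pre-check width, here is a minimal standalone sketch, not code from this PR. The names `crypt_out`, `MAX_KEYS`, `HASH_SIZE`, `cmp_all_32` and `cmp_all_64` are hypothetical, and the functions use simplified signatures rather than the real format-method prototypes. It only shows that a 32-bit pre-check reports a candidate that a 64-bit pre-check would already reject, leaving the full comparison to `cmp_one`.

```c
#include <stdint.h>
#include <string.h>

#define MAX_KEYS  8   /* hypothetical number of candidates per batch */
#define HASH_SIZE 16  /* hypothetical binary hash size in bytes */

/* Hypothetical buffer of hashes computed for the current batch. */
static uint8_t crypt_out[MAX_KEYS][HASH_SIZE];

/* Pre-check comparing only the first 32 bits of each computed hash. */
static int cmp_all_32(const void *binary, int count)
{
    uint32_t want, got;
    int i;

    memcpy(&want, binary, sizeof(want));
    for (i = 0; i < count; i++) {
        memcpy(&got, crypt_out[i], sizeof(got));
        if (got == want)
            return 1;
    }
    return 0;
}

/* Same pre-check widened to 64 bits: about as cheap on 64-bit archs,
 * but a chance partial collision becomes far less likely. */
static int cmp_all_64(const void *binary, int count)
{
    uint64_t want, got;
    int i;

    memcpy(&want, binary, sizeof(want));
    for (i = 0; i < count; i++) {
        memcpy(&got, crypt_out[i], sizeof(got));
        if (got == want)
            return 1;
    }
    return 0;
}

/* Full comparison for a single candidate. */
static int cmp_one(const void *binary, int index)
{
    return !memcmp(binary, crypt_out[index], HASH_SIZE);
}
```

With a candidate that matches the target in its first four bytes only, `cmp_all_32` returns non-zero and forces a `cmp_one` call, while `cmp_all_64` returns zero right away.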
Other observations:
- In `PKZIP`, the `chk` array is always fully written to by all threads, which means the cache lines holding it are bounced between CPUs. We may want to add our `any_cracked` logic and conditionally `memset` the array instead.
- In `PKZIP`, `cmp_all` uses an add loop unlike what we have in other formats. This optimizes for the case of no matches (so having to run the loop to the end anyway), but with high key counts there are quite often some matches. Maybe this optimization is outdated by increasing key counts.
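The two `PKZIP` observations can be sketched roughly as follows. This is a simplified standalone illustration, not the actual `PKZIP` code: `chk`, `any_cracked` and `MAX_KEYS` are modeled here as plain globals, the helper names are hypothetical, and all threading is omitted (with OpenMP, the write to `any_cracked` would need to be made safe, for example with an atomic update).

```c
#include <string.h>

#define MAX_KEYS 16  /* hypothetical number of candidates per batch */

static unsigned char chk[MAX_KEYS];  /* per-candidate match flags */
static int any_cracked;              /* set once any flag is written */

/* Start of a batch: clear chk[] only if the previous batch dirtied it,
 * so the usual no-match case does not write to the array at all. */
static void reset_flags(int count)
{
    if (any_cracked) {
        memset(chk, 0, count);
        any_cracked = 0;
    }
}

/* Record a (partial) match for one candidate. */
static void mark_match(int index)
{
    chk[index] = 1;
    any_cracked = 1;
}

/* "Add loop" style: always scans to the end, no branch inside the loop. */
static int cmp_all_add(int count)
{
    int i, sum = 0;

    for (i = 0; i < count; i++)
        sum += chk[i];
    return sum;
}

/* Early-exit style: stops at the first match. */
static int cmp_all_early_exit(int count)
{
    int i;

    for (i = 0; i < count; i++)
        if (chk[i])
            return 1;
    return 0;
}
```

The add loop costs the same whether or not anything matched, whereas the early-exit loop gets cheaper the earlier a match appears. With an `any_cracked` flag maintained as above, `cmp_all` could also simply return the flag without scanning the array.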