Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix underflow in Sift4Common #8

Merged
merged 2 commits into from
Nov 19, 2024

Conversation

Palladinium
Copy link
Contributor

@Palladinium Palladinium commented Sep 23, 2024

I'm running the Sift4Common distance over results of OCR, which are often quite noisy. I've found some strings in the wild which cause a panic in debug mode, because of a subtraction with underflow.

I haven't actually fixed the bug yet (hence the draft status), but I've added some failing tests for the moment. I'll keep digging to identify the root cause and a fix.

@Palladinium Palladinium marked this pull request as draft September 23, 2024 06:51
@Palladinium Palladinium changed the title Fix underflow in Sift4Common (draft) Fix underflow in Sift4Common Sep 23, 2024
@Palladinium
Copy link
Contributor Author

Palladinium commented Sep 23, 2024

From a first look, it seems to me that the overflow occurs due to the optimization at line 100, wherein the two cursors have 1 subtracted from them in advance to account for the 1 that is added later in the loop. Because i can be 0, doing c1 += i - 1 underflows the right-hand side of the assignment.

I tried the easy fix of splitting the line into c1 += i; c1 -= 1;, but because c1 is zero, the overflow occurs anyways.

Ultimately, I think the result of the computation is the same, because the overflowed value gets 1 added to it immediately after, which overflows it back to 0. I can think of two solutions which don't require deviating significantly from the semantics of the original implementation:

  • Change the type of c1 and c2 to isize, and cast back to usize when indexing into s1 and s2
  • Wrap c1 and c2 in Wrapping, to suppress the panic

I'm not a fan of either, but I don't really see a better option.

@orsinium
Copy link
Member

orsinium commented Sep 23, 2024

Bring the fix that you see the best fit. If it passes all the tests, I'll be happy to merge. I don't remember the algorithm that well, so I'll trust your judgement.

@Palladinium Palladinium marked this pull request as ready for review November 19, 2024 04:01
@orsinium
Copy link
Member

Thank you :)

@orsinium orsinium merged commit cb7227c into life4:master Nov 19, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants