-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Order of rows in crosscheck metrics files are non-deterministic #1854
Comments
As always I suspect HashMap/Set |
Is it possible to fix it? |
Reopening, as this was not completely fixed by #1892 |
Remaining TODOs (courtesy of @lbergelson ): I looked at this a bit more, it seems like there is instability built in at the start since FingerPrintChecker.fingerprintFiles() builds an unsorted map from the output of a parallel and unpredictable process. I think it's going to require a more complicated change than just changing the other sets to linked sets. Either change the inputs to return their results in order, or sort the maps according to some criteria. I think we either need to change it so the input maps are built as the result of Futures instead of being added to the concurrent map as the are built. The errors are actually handled this way so it wouldn't be that hard to change. The other option would be to pick an arbitrary sort order and sort the input maps after they're built and then maintain that order. |
The order of rows output by CrosscheckFingerprints seems to change from run to run on the same input data. Not sure if this is new behavior or has always been the case.
This isn't a massive problem, but would be nice for the order to always be the same.
The text was updated successfully, but these errors were encountered: