Fix Evidence threshold bug #185


Merged

Conversation

@vkakerbeck (Contributor)

There was a bug in the _get_evidence_update_threshold function in the EvidenceLM: instead of setting the threshold for updating a hypothesis to x_percent_threshold, it was set to 1 - x_percent_threshold.

Results are as expected:

  • After the change is applied, accuracy and run time go down (green vs. grey) because we test way fewer hypotheses at every step.
  • If we set the evidence update threshold manually back to 80% instead of 20%, we get exactly the same results as before the fix (turquoise vs. grey). This is just a sanity check.
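A minimal sketch of the inversion described above (function and variable names here are hypothetical stand-ins, not the actual tbp.monty code; x_percent_threshold is expressed as a fraction for brevity):

```python
def evidence_update_threshold(max_evidence: float, x_percent_threshold: float) -> float:
    """Evidence a hypothesis needs to stay within x% of the current maximum."""
    # Pre-fix (buggy): the fraction was inverted, so a 20% setting
    # effectively became an 80% band and far too many hypotheses were
    # updated at every step.
    # frac = 1 - x_percent_threshold
    # Post-fix: use the configured fraction directly.
    frac = x_percent_threshold
    return max_evidence - max_evidence * frac
```

With the fix, setting the fraction to 0.8 reproduces the old buggy behavior of a 0.2 setting, which is exactly the sanity check in the second bullet above.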

[Screenshot: benchmark results comparison, 2025-02-21 5:06 PM]

Given how much accuracy decreases after the fix, it seems worth creating a separate variable for the evidence update threshold (if it is a percentage) instead of reusing the x_percent_threshold variable defined for the terminal condition. This PR includes a proposed way to add an option to specify evidence_update_threshold as a percentage.
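One hedged sketch of what accepting a percentage-style setting could look like (the helper name and the "80%" string format are assumptions for illustration, not the PR's actual API):

```python
def parse_evidence_update_threshold(value):
    """Hypothetical helper: interpret a threshold setting that may be a percentage.

    A string like "80%" means "within 80% of the maximum evidence";
    any other value is treated as an absolute threshold.
    """
    if isinstance(value, str) and value.endswith("%"):
        percentage = float(value[:-1])
        assert 0 <= percentage <= 100, "Percentage must be between 0 and 100"
        return ("percent", percentage)
    return ("absolute", float(value))
```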

I ran some hyperparameter tests for different values:
[Screenshot: hyperparameter test results, 2025-02-21 5:08 PM]

Remaining tasks before this can become a full PR:

  • Test this parameter on 77 object experiment
  • Test whether there is a different optimal x_percent_threshold value, given that these two parameters are disentangled now
  • Rerun all benchmarks & update results with the new values (if we set new values; currently results should be exactly the same)

@vkakerbeck added the bug label on Feb 21, 2025
@nielsleadholm (Contributor) left a comment:

The change looks good! Will review properly when it's a full PR but don't see any issue with the updated code.

```python
# Excerpt from the diff (assert opening reconstructed from its message):
assert (
    0 <= percentage <= 100
), "Percentage must be between 0 and 100"
max_global_evidence = self.current_mlh["evidence"]
x_percent_of_max = max_global_evidence * (percentage / 100)
return max_global_evidence - x_percent_of_max
elif self.evidence_update_threshold == "x_percent_threshold":
```
@tristanls (Contributor) commented on Feb 21, 2025:

suggestion (non-blocking): Perhaps rename "x_percent_threshold" and the variable to something else? Now that evidence_update_threshold can take an "x percent" threshold value (e.g., "80%"), this name becomes quite confusing to parse.

note: I'm generally confused what x means.

@vkakerbeck (Contributor, Author) replied:

I was planning to do this in a separate PR (as it would touch a lot of the other code). Is that ok?

@tristanls (Contributor) replied:
Sorry, I should have mentioned that it was a non-blocking suggestion. I have no issues with that being a separate PR, just wanted to highlight my confusion.

@vkakerbeck marked this pull request as ready for review on February 21, 2025, 18:59
@vkakerbeck (Contributor, Author) commented:
Just turned this into a real PR, as it doesn't affect the benchmark results; it just fixes the bug. A follow-up PR can then update to optimal parameters and refresh the benchmark results with those.

@nielsleadholm (Contributor) left a comment:
Looks great, thanks for fixing this so quickly!

@hlee9212 (Contributor) left a comment:
LGTM.

@vkakerbeck merged commit a3db6f3 into thousandbrainsproject:main on Feb 24, 2025
13 checks passed
nielsleadholm pushed a commit to nielsleadholm/tbp.monty that referenced this pull request Mar 3, 2025