Skip to content

[Feat] toploc2 #360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 12, 2025
Merged

[Feat] toploc2 #360

merged 18 commits into from
Jun 12, 2025

Conversation

Jackmin801
Copy link
Member

No description provided.

@Jackmin801 Jackmin801 marked this pull request as ready for review June 3, 2025 09:44
@Jackmin801
Copy link
Member Author

image graphs look ok

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new Toploc2Sampler for logit sampling, propagates per-sample seeds through the pipeline into Parquet outputs, and adjusts related tests and utilities to include the seed field.

  • Add Toploc2Sampler and switch to it when appropriate, disabling chunked prefill for correctness
  • Introduce and propagate a seed field in configs, inference logic, Parquet schema, and tests
  • Add a model validator to convert negative logprobs config values to None

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/integration/inference/test_debug_pp.py Standardize model name arguments
tests/conftest.py Add dummy seed column to test tables
src/zeroband/utils/parquet.py Extend Parquet schema with seed field
src/zeroband/inference/toploc2.py Implement the new Toploc2 sampling layer
src/zeroband/inference/parquet.py Pass seed values into Parquet records
src/zeroband/inference/config.py Validate and normalize logprobs setting
src/zeroband/infer.py Wire up Toploc2Sampler and seed logic
Comments suppressed due to low confidence (2)

src/zeroband/inference/toploc2.py:124

  • This TODO should be resolved or removed; if rank information is required for logprob metadata, implement a clear method to retrieve it and document the approach.
# TODO: How did the original code know the rank?

src/zeroband/inference/config.py:22

  • [nitpick] The method name and docstring refer to "negative logprobs," but the field is a count of logprobs; consider renaming to convert_negative_logprobs_count_to_none or clarifying that logprobs < 0 disables computation.
    def convert_negative_logprobs_to_none(self):

Copy link
Collaborator

@mikasenghaas mikasenghaas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hell yea! nice job. have we tested that the produced output tokens are identical for the sampler and toploc sampler, ie. are we absolutely sure we are not altering model behavior? could be nice to have a simple test for this?

also, i think adding toploc 2 into the configs is also important. let's get this merged soon, so that i can rebase onto the config refactor:)

Comment on lines +21 to +26
@model_validator(mode="after")
def convert_negative_logprobs_to_none(self):
"""Convert negative logprobs values to None to disable logprobs calculation."""
if self.logprobs is not None and self.logprobs < 0:
self.logprobs = None
return self
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels more intuitive to err when passing negative values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is necessary to disable logprobs. I couldnt find a way to pass none and since the default is 0, it didnt seem like there was a way to make it None other than this

Copy link
Member

@samsja samsja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lfgtm v2

@Jackmin801
Copy link
Member Author

Should be ok. lets merge!
image

@Jackmin801 Jackmin801 merged commit 757b768 into main Jun 12, 2025
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants