Feature Request: Integrate Min-P Sampling Strategy into Gemma #325

Open
@mehulbafnaa

Description
Add a new decoding strategy, Min-P Sampling, to the Gemma text generation API. Min-P Sampling keeps only the tokens whose probability is at least a fraction p of the most likely token's probability, then samples from the renormalized remainder. Because the cutoff scales with the model's confidence, it aims to balance output diversity and coherence, sitting between deterministic decoding and fully stochastic sampling.


Motivation

  • Enhanced Diversity & Creativity: Provides a tunable trade-off between randomness and quality, and is reported to outperform Top-k and Top-p in certain creative generation tasks.
  • Reduced Repetition: Reported to mitigate the looping and repetitive token generation that can occur with other samplers.
  • User Control: Offers a single parameter p that is intuitive and consistent with existing sampling APIs.

Proposed Implementation

  1. New Class

    • Create MinPSampling under gemma.gm.text, mirroring the API of Greedy, TopkSampling, and ToppSampling.
  2. Decoder Logic

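    • Compute the threshold p · p_max, where p_max is the probability of the most likely token, and mask out every token whose probability falls below it.
    • Renormalize the remaining probabilities and sample the next token from this “min-p” set according to those renormalized probabilities.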
  3. Testing

    • Unit tests covering edge cases (p=0.0, p=1.0, very small vocabularies).
    • Integration tests comparing output distributions against a reference implementation.
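
For illustration, here is a minimal, framework-free sketch of min-p filtering as it is commonly defined: keep tokens whose probability is at least p times the top token's probability, renormalize, then sample. It is not the code from the referenced PR and does not use the Gemma API. The names `min_p_filter` and `min_p_sample` are hypothetical, and an actual `MinPSampling` class would presumably operate on batched JAX arrays instead of Python lists.

```python
import math
import random


def min_p_filter(logits, p):
    """Return renormalized probabilities after min-p filtering (hypothetical helper).

    Tokens whose probability is below p * max_prob receive probability 0.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # Numerically stable softmax over the logits.
    top_logit = max(logits)
    exps = [math.exp(x - top_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep tokens that are at least p times as likely as the top token.
    threshold = p * max(probs)
    kept = [q if q >= threshold else 0.0 for q in probs]
    # The top token always survives because p <= 1, so kept_total > 0.
    kept_total = sum(kept)
    return [q / kept_total for q in kept]


def min_p_sample(logits, p, rng=random):
    """Sample one token id from the min-p filtered distribution."""
    probs = min_p_filter(logits, p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Under this definition, p=0.0 leaves the softmax distribution unchanged and p=1.0 reduces to greedy decoding (barring exact ties for the top token), which is what the edge-case unit tests listed above would need to check.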

References

Implementation PR: #320
