
Add support for gpt-4.1 and gpt-4.5 model variants to encoding maps #407


Open: wants to merge 2 commits into base: main

TaylorN15

This PR adds support for newly released OpenAI model variants:

  • gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
  • gpt-4.5, gpt-4.5-mini, gpt-4.5-nano

tiktoken.encoding_for_model() does not currently recognize these model names, so calling it with any of them raises an exception. All of these models use the o200k_base tokenizer, so this patch maps them to that encoding.

Changes:

  • Updated MODEL_TO_ENCODING for bare-name support (gpt-4.1, gpt-4.5)
  • Updated MODEL_PREFIX_TO_ENCODING for all suffix variants (gpt-4.1-*, gpt-4.5-*)
  • Extended test_encoding_for_model in test_misc.py to validate correct encoding resolution
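
For readers unfamiliar with the two maps, the lookup order they imply can be sketched as below. This is a minimal, self-contained illustration and not the library's code: the dictionaries are stand-ins holding only the entries described in this PR, and resolve_encoding_name is a hypothetical helper that assumes an exact-name match is tried before a prefix match.

```python
# Illustrative stand-ins for the two maps named in this PR. They contain only
# the entries described above, not the library's full tables.
MODEL_TO_ENCODING = {
    "gpt-4.1": "o200k_base",
    "gpt-4.5": "o200k_base",
}
MODEL_PREFIX_TO_ENCODING = {
    "gpt-4.1-": "o200k_base",
    "gpt-4.5-": "o200k_base",
}


def resolve_encoding_name(model_name: str) -> str:
    """Hypothetical helper: try an exact match first, then a prefix match."""
    if model_name in MODEL_TO_ENCODING:
        return MODEL_TO_ENCODING[model_name]
    for prefix, encoding_name in MODEL_PREFIX_TO_ENCODING.items():
        if model_name.startswith(prefix):
            return encoding_name
    # Unknown names fail loudly instead of falling back to a default encoding.
    raise KeyError(f"Could not map {model_name!r} to an encoding")


if __name__ == "__main__":
    for name in ("gpt-4.1", "gpt-4.1-mini", "gpt-4.5-nano"):
        print(name, "->", resolve_encoding_name(name))
```

Under these assumptions, bare names resolve through the exact-match table and suffixed variants such as gpt-4.1-mini resolve through the prefix table, which is why the PR touches both maps.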

Notes

  • These changes follow the existing mapping pattern for gpt-4o and similar models.
  • One test, test_hyp_roundtrip[cl100k_base], fails because of tokenizer restrictions on special tokens (e.g., <|endofprompt|>); the failure is unrelated to this PR.

TaylorN15 added 2 commits June 5, 2025 05:46
- Add support for "gpt-4.1" and "gpt-4.5" with "o200k_base" encoding
- Refactor test to include multiple models for encoding validation
@TaylorN15
Author

I didn't notice #396, but this PR handles 4.1 and 4.5 and related tests.

@TaylorN15
Author

@hauntsaninja FYI
