[Feature] Add train/eval/inference configs for OpenAI gpt-oss models #1906

@wizeng23

Description

Feature request

Plan to support both the 20B and 120B models.

Upgrades we need to make on our side to support all of the optimizations recommended in the HF blog:

  • Job configs should use at least H100 GPUs, since the mxfp4 quantization requires Hopper-class hardware
  • Need torch 2.7/2.8 (we're on 2.6) -- 2.8 is recommended, and was just released today
  • Need to upgrade our transformers version to 4.55
  • Also need accelerate, kernels, triton 3.4, and triton_kernels
  • Make sure we can support adjustable reasoning levels (the level seems to be set via the system prompt through the chat template; see the first sketch after this list)
  • If not using remote/vLLM/HF inference, we need to support the Harmony response format (see the parsing sketch after this list)
  • Support Flash Attention 3 with attention sinks from vLLM. The attn_implementation field seems to have been generalized to pull in arbitrary kernels from the HF Hub; we should add support for this (see the second sketch after this list)
  • Ensure it works on vLLM. Right now, following their instructions here, I get: TypeError: flash_attn_varlen_func() got an unexpected keyword argument 's_aux'
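
First sketch, for the adjustable reasoning levels: this assumes the gpt-oss chat template reads the level out of the system message, following the "Reasoning: high" convention described in the HF blog (the model id and convention are taken from there, not verified against our stack):

```python
from transformers import AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    # Assumption: the chat template picks up the reasoning level
    # ("low" / "medium" / "high") from the system message.
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain mixed-precision training in one paragraph."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
```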
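
Second sketch, for the generalized attn_implementation field: this assumes transformers 4.55 accepts a Hub kernel repo id there, and that kernels-community/vllm-flash-attn3 (the repo named in the HF blog) is the right FA3 build:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumption: attn_implementation can name a kernel repo on the HF Hub,
# here the Flash Attention 3 build with attention-sink support.
model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="kernels-community/vllm-flash-attn3",
)
```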
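
And a parsing sketch for the Harmony response format, assuming the openai_harmony package's API as shown in its README (names like render_conversation_for_completion and parse_messages_from_completion_tokens come from there):

```python
from openai_harmony import (
    Conversation,
    HarmonyEncodingName,
    Message,
    Role,
    load_harmony_encoding,
)

encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)

# Render a conversation into the token ids the model should complete.
convo = Conversation.from_messages(
    [Message.from_role_and_content(Role.USER, "What is 2 + 2?")]
)
prefill_ids = encoding.render_conversation_for_completion(convo, Role.ASSISTANT)

# After generation, parse the completion tokens back into structured messages
# (analysis vs. final channels) rather than treating the output as raw text.
# completion_ids = ...  # tokens produced by the model
# parsed = encoding.parse_messages_from_completion_tokens(completion_ids, Role.ASSISTANT)
```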

Motivation / references

https://huggingface.co/blog/welcome-openai-gpt-oss
#1661
https://openai.com/open-models/
https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers
https://github.com/huggingface/gpt-oss-recipes/blob/main/sft.py

Your contribution

PR

Labels

Feature · enhancement (New feature or request) · triage (This issue needs review by the core team)
