
CI fails: ValueError: Found unknown kwargs={'hidden_size': 32} #4210

@albertvillanova

Description

CI fails for all tests: https://github.com/huggingface/trl/actions/runs/18275955186/job/52028039921

 ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_transformers_bf16_kwargs - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_value_head - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_value_head_shape - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_value_head_init_random - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_value_head_not_str - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_from_save_trl - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_from_save_trl_sharded - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_from_save_transformers_sharded - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_from_save_transformers - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_inference - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_dropout_config - ValueError: Found unknown kwargs={'hidden_size': 32}
FAILED tests/test_modeling_value_head.py::TestCausalLMValueHeadModel::test_dropout_kwargs - ValueError: Found unknown kwargs={'hidden_size': 32}

Traceback:

tests/test_modeling_value_head.py:246: in test_generate
    model = self.trl_model_class.from_pretrained(model_name).to(self.device)
trl/models/modeling_base.py:214: in from_pretrained
    pretrained_model = cls.transformers_parent_class.from_pretrained(
.venv/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py:549: in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
.venv/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py:1321: in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
.venv/lib/python3.9/site-packages/transformers/configuration_utils.py:808: in from_dict
    config = cls(**config_dict)
.venv/lib/python3.9/site-packages/transformers/models/dbrx/configuration_dbrx.py:209: in __init__
    self.ffn_config = DbrxFFNConfig(**ffn_config)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = DbrxFFNConfig {
  "ffn_act_fn": {
    "name": "silu"
  },
  "ffn_hidden_size": 10752,
  "moe_jitter_eps": 0,
  "moe_lo...  "moe_normalize_expert_weights": 1.0,
  "moe_num_experts": 16,
  "moe_top_k": 4,
  "transformers_version": "4.56.2"
}

ffn_act_fn = {'name': 'silu'}, ffn_hidden_size = 10752, moe_num_experts = 16
moe_top_k = 4, moe_jitter_eps = 0, moe_loss_weight = 0.05
moe_normalize_expert_weights = 1.0, kwargs = {'hidden_size': 32}, k = 'dtype'

    def __init__(
        self,
        ffn_act_fn: Optional[dict] = None,
        ffn_hidden_size: int = 3584,
        moe_num_experts: int = 4,
        moe_top_k: int = 1,
        moe_jitter_eps: Optional[float] = None,
        moe_loss_weight: float = 0.01,
        moe_normalize_expert_weights: Optional[float] = 1.0,
        **kwargs: Any,
    ):
        super().__init__()
        if ffn_act_fn is None:
            ffn_act_fn = {"name": "silu"}
        self.ffn_act_fn = ffn_act_fn
        self.ffn_hidden_size = ffn_hidden_size
        self.moe_num_experts = moe_num_experts
        self.moe_top_k = moe_top_k
        self.moe_jitter_eps = moe_jitter_eps
        self.moe_loss_weight = moe_loss_weight
        self.moe_normalize_expert_weights = moe_normalize_expert_weights
    
        for k in ["model_type", "attn_implementation", "transformers_version", "_commit_hash", "torch_dtype", "dtype"]:
            if k in kwargs:
                kwargs.pop(k)
        if len(kwargs) != 0:
>           raise ValueError(f"Found unknown {kwargs=}")
E           ValueError: Found unknown kwargs={'hidden_size': 32}

.venv/lib/python3.9/site-packages/transformers/models/dbrx/configuration_dbrx.py:116: ValueError
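As the traceback shows, `DbrxFFNConfig.__init__` pops a fixed allowlist of keys from `**kwargs` and raises on anything left over, so a stray top-level key like `hidden_size` that leaks into `ffn_config` is rejected. For reference, a minimal sketch that reproduces the error directly from the config class (assuming transformers 4.56.2, as in the failing run; the values mirror the traceback):

```python
from transformers.models.dbrx.configuration_dbrx import DbrxFFNConfig

# After popping known keys ("model_type", "transformers_version", "dtype", ...),
# any remaining kwarg triggers the ValueError seen in CI.
try:
    DbrxFFNConfig(ffn_hidden_size=10752, hidden_size=32)
except ValueError as err:
    print(err)  # -> Found unknown kwargs={'hidden_size': 32}
```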
