
ModernBERT logits do not have gradient #35386

Closed
3 of 4 tasks
andersonbcdefg opened this issue Dec 21, 2024 · 3 comments

andersonbcdefg commented Dec 21, 2024

System Info

latest transformers version (from source), python 3.10

Who can help?

@ArthurZ

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")

# Create a simple input
inputs = {
    "input_ids": torch.randint(0, 1000, (1, 10)).cuda(),
    "attention_mask": torch.ones(1, 10).cuda(),
}

# Set to train mode and check all parameters
model.train()
for name, param in model.named_parameters():
    print(f"{name}: requires_grad = {param.requires_grad}")

# Do forward pass
outputs = model(**inputs)
print("\nOutput logits requires_grad:", outputs.logits.requires_grad)
print("Output logits grad_fn:", outputs.logits.grad_fn)

Expected behavior

When I do this, the output is:

Output logits requires_grad: False
Output logits grad_fn: None

This is despite all of the parameters having requires_grad = True: printing them confirms every parameter is correctly set to requires_grad = True.

Just to sanity check, I ran the same code but set model_id = "bert-base-uncased", and got:

Output logits requires_grad: True
Output logits grad_fn: <ViewBackward0 object at 0x7f0ca6abf370>

So it's definitely a ModernBERT-specific problem!
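
(Not part of the original report: a minimal sketch of how one might verify, once a fix lands, that gradients actually flow from the logits back into the weights. Reusing the input ids as dummy labels and checking parameter .grad tensors after backward are assumptions made for illustration.)

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")
model.train()

inputs = {
    "input_ids": torch.randint(0, 1000, (1, 10)).cuda(),
    "attention_mask": torch.ones(1, 10).cuda(),
}
# Reuse the input ids as dummy labels so the masked-LM head returns a loss (assumption)
labels = inputs["input_ids"].clone()

outputs = model(**inputs, labels=labels)
assert outputs.logits.requires_grad, "logits are detached from the autograd graph"

outputs.loss.backward()
# After backward, at least some trainable parameters should hold a populated .grad
print(any(p.grad is not None for p in model.parameters() if p.requires_grad))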

@warner-benjamin
Contributor

This is a bug, and I have a fix for it in #35404.


github-actions bot commented Feb 3, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@NielsRogge
Contributor

Closing this as it was fixed thanks to the PR above.
