
ModernBERT logits do not have gradient #35386

Closed
3 of 4 tasks
andersonbcdefg opened this issue Dec 21, 2024 · 3 comments

andersonbcdefg commented Dec 21, 2024

System Info

latest transformers version (from source), python 3.10

Who can help?

@ArthurZ

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")

# Create a simple input
inputs = {
    "input_ids": torch.randint(0, 1000, (1, 10)).cuda(),
    "attention_mask": torch.ones(1, 10).cuda(),
}

# Set to train mode and check all parameters
model.train()
for name, param in model.named_parameters():
    print(f"{name}: requires_grad = {param.requires_grad}")

# Do forward pass
outputs = model(**inputs)
print("\nOutput logits requires_grad:", outputs.logits.requires_grad)
print("Output logits grad_fn:", outputs.logits.grad_fn)

Expected behavior

When I do this, the output is:

Output logits requires_grad: False
Output logits grad_fn: None

This is despite all of the parameters having requires_grad = True: printing them confirms every parameter is correctly set to requires_grad = True.

Just to sanity check, I ran the same code but set model_id = "bert-base-uncased", and got:

Output logits requires_grad: True
Output logits grad_fn: <ViewBackward0 object at 0x7f0ca6abf370>

So it's definitely a ModernBERT-specific problem!
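
(Not part of the original report: a minimal sketch of how one might verify, once a fix lands, that gradients actually flow from the logits back into the weights. Reusing the input ids as dummy labels and checking parameter .grad tensors after backward are assumptions made for illustration.)

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")
model.train()

inputs = {
    "input_ids": torch.randint(0, 1000, (1, 10)).cuda(),
    "attention_mask": torch.ones(1, 10).cuda(),
}
# Reuse the input ids as dummy labels so the masked-LM head returns a loss (assumption)
labels = inputs["input_ids"].clone()

outputs = model(**inputs, labels=labels)
assert outputs.logits.requires_grad, "logits are detached from the autograd graph"

outputs.loss.backward()
# After backward, at least some trainable parameters should hold a populated .grad
print(any(p.grad is not None for p in model.parameters() if p.requires_grad))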

@warner-benjamin
Contributor

This is a bug, and I have a fix for it in #35404.


github-actions bot commented Feb 3, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@NielsRogge
Contributor

Closing this as it was fixed thanks to the PR above.
