BERT #62
I ran into the same issue and made a few modifications to the way I load my fine-tuned BERT model. As far as I can tell, the newly initialized weights belong only to the pooler layer. In my case, I fine-tuned a BERT model with MLM, which doesn't train the pooling layer since it isn't required for that task. As a result, the saved checkpoint doesn't include those parameters, and re-loading it produces the warning you mentioned. From what I understand, the GD model also does not use the pooled output, so when loading a fine-tuned BERT model only the pooler-layer weights are newly initialized, not the fine-tuned parameters. In theory, then, the warning doesn't really matter for GD. Still, for my own sanity, and to make sure nothing is left randomly initialized, I implemented a function that loads my fine-tuned BERT model and re-initializes the pooler weights with the ones from a pre-trained BERT model.
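Something along these lines (a minimal sketch assuming Hugging Face `transformers` and PyTorch; the fine-tuned path and the `bert-base-uncased` name are placeholders for whatever you trained from):

```python
import torch
from transformers import BertModel

def load_bert_with_pretrained_pooler(finetuned_path,
                                     pretrained_name="bert-base-uncased"):
    # The fine-tuned MLM checkpoint has no pooler weights, so the pooler
    # is randomly initialized here (this is what triggers the warning).
    model = BertModel.from_pretrained(finetuned_path)

    # Copy the pooler weights from a stock pre-trained BERT instead.
    reference = BertModel.from_pretrained(pretrained_name)
    model.pooler.load_state_dict(reference.pooler.state_dict())

    # Sanity check: the pooler now matches the reference exactly.
    for (name, p), (_, q) in zip(model.pooler.named_parameters(),
                                 reference.pooler.named_parameters()):
        assert torch.equal(p, q), f"pooler weight mismatch: {name}"
    return model

# e.g. model = load_bert_with_pretrained_pooler("path/to/my-mlm-finetuned-bert")
```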
Hope this helps!
Hello,
I introduced some parameters to the text encoder (BERT) and trained for some epochs. Everything ran smoothly.
But when evaluating with the resulting checkpoint, I get this warning about the new BERT params:
"Some weights of Bert Model were not initialized from model checkpoint at bert-uncased and are newly initialized"
I think it will then evaluate with vanilla BERT (without these newly added params). My guess is that the trained params are loaded only after the BERT model itself is loaded, which is what causes this warning.
So, what should I change in the config file in order to evaluate with BERT including these newly introduced, trained params?
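One way to check whether the trained params actually end up in the model is to inspect the checkpoint's state dict and do a non-strict load. This is only a rough sketch; the checkpoint filename, the "model" key, and the "bert." prefix are guesses at what the training script saves:

```python
import torch
from transformers import BertModel

# Load the raw checkpoint; unwrap it if it's nested under a "model" key.
ckpt = torch.load("checkpoint.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)

# List the BERT-related keys that were actually saved.
bert_keys = [k for k in state_dict if "bert" in k.lower()]
print(f"{len(bert_keys)} BERT-related keys saved, e.g. {bert_keys[:5]}")

# A non-strict load reports which params were NOT found in the checkpoint
# (missing) and which were saved but unused (unexpected).
model = BertModel.from_pretrained("bert-base-uncased")
missing, unexpected = model.load_state_dict(
    {k.replace("bert.", "", 1): v
     for k, v in state_dict.items() if k.startswith("bert.")},
    strict=False,
)
print("missing:", missing)
print("unexpected:", unexpected)
```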
Thanks in advance!
cc: @aghand0ur