Hello, I would be grateful if someone could answer this question clearly: can DialoGPT be fine-tuned on a model other than GPT-2, and if so, how?
I tried to fine-tune with GPT-J instead, changing line 195 of LSP_train.py from
model = load_model(GPT2LMHeadModel(config), args.init_checkpoint, args, verbose=True)
to
model = load_model(GPTJForCausalLM.from_pretrained('EleutherAI/gpt-j-6B'), args.init_checkpoint, args, verbose=True)
but I get this error:
  File "LSP_train.py", line 287, in <module>
    loss, ppl = model(input_ids, position_ids, token_ids, label_ids)
  File "/opt/conda/envs/dialogpt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/dialogpt/lib/python3.7/site-packages/transformers/models/gptj/modeling_gptj.py", line 832, in forward
    return_dict=return_dict,
  File "/opt/conda/envs/dialogpt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/dialogpt/lib/python3.7/site-packages/transformers/models/gptj/modeling_gptj.py", line 589, in forward
    past_length = past_key_values[0][0].size(-2)
IndexError: dimension specified as -2 but tensor has no dimensions
The modified script fails with this error on both GPU and CPU, but the original works fine with the GPT-2 model.
Would appreciate any help!
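
One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: line 287 passes its four tensors positionally, and the Hugging Face GPT-J forward method appears to take past_key_values as its second parameter, so position_ids would end up being treated as the key/value cache. The stand-in function below is hypothetical; its parameter order is an assumption based on the Hugging Face implementation and should be checked against the installed transformers version. Placeholder strings replace the real tensors so the example runs without any dependencies.

```python
# Hypothetical stand-in for GPTJForCausalLM.forward. The parameter order is an
# assumption and should be verified against the installed transformers version.
def gptj_like_forward(input_ids=None, past_key_values=None, attention_mask=None,
                      token_type_ids=None, position_ids=None, labels=None):
    # Report which parameter each supplied argument landed in.
    received = {
        "input_ids": input_ids,
        "past_key_values": past_key_values,
        "attention_mask": attention_mask,
        "token_type_ids": token_type_ids,
        "position_ids": position_ids,
        "labels": labels,
    }
    return {name: value for name, value in received.items() if value is not None}


# Hypothetical helper: routes each value to its intended parameter by keyword.
def call_with_keywords(model_forward, input_ids, position_ids, token_ids, label_ids):
    return model_forward(
        input_ids=input_ids,
        position_ids=position_ids,
        token_type_ids=token_ids,
        labels=label_ids,
    )


# Placeholder strings stand in for the tensors built in LSP_train.py.
batch = ("input_ids", "position_ids", "token_ids", "label_ids")

# Positional call, in the same style as line 287 of LSP_train.py.
positional = gptj_like_forward(*batch)

# Keyword call: every value reaches the parameter it was meant for.
keyword = call_with_keywords(gptj_like_forward, *batch)

print(positional)
print(keyword)
```

If this is the cause, calling the model with keyword arguments should get past the IndexError. The script also unpacks the result as loss, ppl, which the Hugging Face model likely does not return in that form, so the return value would probably need adapting as well.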