Training based on Teacher forcing technique #28

Open
omidvaramin opened this issue Mar 13, 2022 · 0 comments
Hi,
Thank you for your code.
I have a question about the way the model is trained. The paper says T5 is trained with the teacher forcing technique: at each time step of decoding, the decoder's input should come from the ground-truth data, not from the previously generated token. In your code, however, it looks as if the model generates the entire output by itself through the following line:
outputs = model(input_ids = ids, attention_mask = mask, decoder_input_ids=y_ids, lm_labels=lm_labels)
loss = outputs[0]
Is my assumption correct that you do not use the teacher forcing technique? Thanks
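For reference, the distinction the question draws can be sketched with a toy decoder. This is a hypothetical stand-in (the `toy_next_token` function is invented for illustration, not taken from the repo or from T5): under teacher forcing, the prefix fed to the decoder at each step is built from ground-truth tokens, while in free-running decoding it is built from the model's own previous predictions.

```python
def toy_next_token(prefix):
    # Deterministic toy "model": predicts (last token + 1), capped at 9.
    # Stands in for one decoder step; not the actual T5 forward pass.
    return min(prefix[-1] + 1, 9)

def decode_teacher_forcing(ground_truth, bos=0):
    """At every step the decoder input is the right-shifted ground-truth
    sequence, regardless of what the model predicted earlier."""
    inputs = [bos] + ground_truth[:-1]  # shift targets right, prepend BOS
    prefix, preds = [], []
    for tok in inputs:
        prefix.append(tok)              # append the ground-truth token
        preds.append(toy_next_token(prefix))
    return preds

def decode_free_running(length, bos=0):
    """At every step the decoder input is the model's own previous output,
    as in inference-time autoregressive generation."""
    prefix, preds = [bos], []
    for _ in range(length):
        nxt = toy_next_token(prefix)
        preds.append(nxt)
        prefix.append(nxt)              # feed the prediction back in
    return preds

# With ground truth [3, 5, 7], teacher forcing conditions each step on the
# true prefix, so predictions can differ from the free-running rollout:
print(decode_teacher_forcing([3, 5, 7]))  # → [1, 4, 6]
print(decode_free_running(3))             # → [1, 2, 3]
```

Note that because the teacher-forced decoder never sees its own predictions, every step can be computed in parallel from the shifted targets, which is what makes it the standard training regime for seq2seq models.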
