Training based on Teacher forcing technique #28

omidvaramin · 2022-03-13T02:46:45Z

Hi,
Thank you for your code.
I have a question about the way the model is trained.
The paper mentions that T5 is trained with the teacher forcing technique, meaning that at each time step of decoding the input should come from the ground-truth data, not from the previously generated token. In your code, however, the model seems to generate the entire output by itself through the following lines:
outputs = model(input_ids = ids, attention_mask = mask, decoder_input_ids=y_ids, lm_labels=lm_labels)
loss = outputs[0]
Is my assumption correct that you do not use the teacher forcing technique? Thanks.
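
To make the distinction concrete, here is a toy sketch in plain Python. It is not the repository's code and it does not use T5. `toy_step`, `teacher_forced_predictions`, and `free_running_predictions` are hypothetical names made up for illustration, and `toy_step` is only a stand-in for one decoder step.

```python
from typing import Callable, List

# Hypothetical stand-in for one decoder step: maps a token prefix to the next token.
StepFn = Callable[[List[int]], int]


def teacher_forced_predictions(step: StepFn, target: List[int], start_token: int = 0) -> List[int]:
    """At step t the decoder is fed the ground-truth prefix, never its own output."""
    # Shift the target right by one position and prepend the start token.
    decoder_inputs = [start_token] + target[:-1]
    return [step(decoder_inputs[: t + 1]) for t in range(len(target))]


def free_running_predictions(step: StepFn, length: int, start_token: int = 0) -> List[int]:
    """At step t the decoder is fed the tokens it generated at earlier steps."""
    prefix = [start_token]
    for _ in range(length):
        prefix.append(step(prefix))
    return prefix[1:]


def toy_step(prefix: List[int]) -> int:
    """Toy rule: predict last token + 1, with a deliberate mistake after token 2."""
    last = prefix[-1]
    return 9 if last == 2 else last + 1


target = [1, 2, 3, 4]
tf = teacher_forced_predictions(toy_step, target)
fr = free_running_predictions(toy_step, len(target))
print("teacher forcing:", tf)
print("free running:   ", fr)
```

In this toy setup both variants make the same mistake at the third step, but only the free-running variant carries it into the next step, because the teacher-forced variant is fed the ground-truth token instead of the wrong prediction.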
