
T5 fine-tuning for summarization decoder_input_ids and labels #15

marcoabrate opened this issue Oct 13, 2020 · 3 comments


@marcoabrate

Hello @abhimishra91,

I was trying to implement the fine-tuning of T5 as explained in your notebook. In addition to implementing the same structure as yours, I have also experimented with the Hugging Face Trainer class. The decoder_input_ids and labels parameters are not very clear to me. When you train the model, you do this:

y = data['target_ids'].to(device, dtype = torch.long)
y_ids = y[:, :-1].contiguous()
lm_labels = y[:, 1:].clone().detach()
lm_labels[y[:, 1:] == tokenizer.pad_token_id] = -100

where y_ids is passed as decoder_input_ids. I don't understand why this preprocessing is needed. May I kindly ask why you skip the last token of target_ids, and why you replace the pad tokens with -100 in the labels?
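To make the question concrete, here is a small self-contained illustration of what that slicing produces on a toy batch (the token ids are made up; I am only assuming T5's defaults of pad_token_id == 0 and eos_token_id == 1):

import torch

pad, eos = 0, 1                                   # T5 defaults: pad_token_id = 0, eos_token_id = 1
y = torch.tensor([[1023, 58, 7, eos, pad, pad]])  # toy target_ids, shape (1, 6)

y_ids = y[:, :-1].contiguous()                    # tensor([[1023, 58, 7, 1, 0]])
lm_labels = y[:, 1:].clone().detach()             # tensor([[  58,  7, 1, 0, 0]])
lm_labels[y[:, 1:] == pad] = -100                 # tensor([[  58,  7, 1, -100, -100]])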
When I use the Hugging Face Trainer, I need to tweak the __getitem__ function of my Dataset like this:

def __getitem__(self, idx):

    ...

    item['decoder_input_ids'] = y[:-1]                    # decoder input: target shifted right (last token dropped)
    lbl = y[1:].clone()                                   # labels: target shifted left (first token dropped)
    lbl[y[1:] == self.tokenizer.pad_token_id] = -100      # padding positions are ignored by the loss
    item['labels'] = lbl

    return item

otherwise the loss function does not decrease over time.
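For reference, this is roughly the full Dataset I ended up with for the Trainer. It is a minimal sketch: the argument names, maximum lengths and tokenizer settings are placeholders for my own data, not the notebook's exact values.

import torch
from torch.utils.data import Dataset

class SummarizationDataset(Dataset):
    def __init__(self, sources, targets, tokenizer, max_source_len=512, max_target_len=150):
        self.sources = sources              # list of input documents
        self.targets = targets              # list of reference summaries
        self.tokenizer = tokenizer
        self.max_source_len = max_source_len
        self.max_target_len = max_target_len

    def __len__(self):
        return len(self.sources)

    def __getitem__(self, idx):
        source = self.tokenizer(
            self.sources[idx], max_length=self.max_source_len,
            padding="max_length", truncation=True, return_tensors="pt",
        )
        target = self.tokenizer(
            self.targets[idx], max_length=self.max_target_len,
            padding="max_length", truncation=True, return_tensors="pt",
        )
        y = target["input_ids"].squeeze(0)

        item = {
            "input_ids": source["input_ids"].squeeze(0),
            "attention_mask": source["attention_mask"].squeeze(0),
        }
        # shift: the decoder sees y[:-1], the loss is computed against y[1:]
        item["decoder_input_ids"] = y[:-1]
        lbl = y[1:].clone()
        lbl[y[1:] == self.tokenizer.pad_token_id] = -100  # ignore padding in the loss
        item["labels"] = lbl
        return item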

Thank you for your help!

@Gorodecki commented Dec 28, 2020

[image attachment: IMG_20201228_185453_834]
Hi, @marcoabrate!
I am also having trouble computing the loss. Can you share your full training code? Did you use multi-GPU?

@marcoabrate (Author)

Hi @Gorodecki
I have abandoned this code since there are plenty of seq2seq training and evaluation examples in the Hugging Face library itself; you can check them out here: https://github.com/huggingface/transformers/tree/master/examples/seq2seq
I was not using multi-GPU. Hope this helps!
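Also, for completeness: with recent versions of transformers (a minimal sketch, assuming a 4.x release and t5-small as an example checkpoint), you do not even need to build decoder_input_ids yourself. If you pass only labels, with pad positions set to -100, T5ForConditionalGeneration creates the decoder inputs internally by shifting the labels to the right:

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

source = tokenizer("summarize: The quick brown fox jumps over the lazy dog.", return_tensors="pt")
target = tokenizer("A fox jumps over a dog.", return_tensors="pt")

labels = target.input_ids.clone()
labels[labels == tokenizer.pad_token_id] = -100   # pads (if any) are ignored by the loss

# no decoder_input_ids here: the model builds them by shifting `labels` to the right
outputs = model(input_ids=source.input_ids,
                attention_mask=source.attention_mask,
                labels=labels)
print(outputs.loss)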

@QuetzalcoatlRosso
