When finetuning an MBart-based model, setting decoder_start_token_id in model.config is NOT ENOUGH. #41492

@Bmingg

Description

System Info

Context: finetuning an MBart model with run_translation.py.
An easy fix is to set decoder_start_token_id in model.generation_config as well. Both settings worked outside of run_translation.py, but inside run_translation.py, leaving it unset in model.generation_config causes validation/evaluation to fail miserably.
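
A minimal sketch of the workaround described above, writing the same id to both model.config and model.generation_config. The helper name set_decoder_start_token is hypothetical (it is not part of transformers or run_translation.py), and the stand-in object only mimics the two attributes so the snippet runs without downloading a checkpoint.

```python
from types import SimpleNamespace


def set_decoder_start_token(model, token_id):
    """Write token_id to model.config and, if present, model.generation_config.

    Hypothetical helper for illustration; it only relies on the two
    attributes named in this issue.
    """
    model.config.decoder_start_token_id = token_id
    generation_config = getattr(model, "generation_config", None)
    if generation_config is not None:
        # generate() may read this copy during evaluation, so keep it in sync.
        generation_config.decoder_start_token_id = token_id
    return model


# Stand-in for a loaded model (assumption: a real MBart model exposes
# .config and .generation_config in the same way).
dummy_model = SimpleNamespace(
    config=SimpleNamespace(decoder_start_token_id=None),
    generation_config=SimpleNamespace(decoder_start_token_id=None),
)
placeholder_id = 7  # placeholder value, not a real language-code id
set_decoder_start_token(dummy_model, placeholder_id)

# With a real model and tokenizer, the id would presumably come from the
# target language code, for example:
#   token_id = tokenizer.convert_tokens_to_ids("ro_RO")
#   set_decoder_start_token(model, token_id)
```

Calling the helper right after loading the model and before building the trainer should keep training and evaluation consistent, assuming evaluation goes through generate().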

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run run_translation.py to finetune an MBart model with a validation file, setting decoder_start_token_id only in model.config.

Expected behavior

With the bug, BLEU scores during validation/evaluation are terrible, which makes it hard to monitor finetuning.
