Support masking of different and repeating roles DataCollatorForCompletionOnlyLM

### Feature request

Now DataCollatorForCompletionOnlyLM assumes that there is the same number of Questions and Answers and they go strictly one after another. I want this class to support all the variants of QAAQA, QAQQA and moreover support multiple _instruction_template_ (e.g. "user" and "tool")
There are 2 problems with implementation: 1 arises when using tools or anything more then Question or Answer templates; 2 arises when your sequence is not strictly QAQAQA (e.g. QAAQA or QAQQA)


### Motivation

Fine-tuning models on agents' trajectories is becoming more and more popular, yet there is no simple tokenizators even for well-known and industry-standard formats such as working with OpenAI's format (see [here for Tool Calling](https://platform.openai.com/docs/guides/fine-tuning#fine-tuning-examples))
Now simple things like QAAQA or QAQQA are not supported as well as any use of tools/funciton calls/etc.
It was also mentioned a few times already - https://github.com/huggingface/trl/issues/1994 and https://github.com/huggingface/trl/issues/2545 with a few people asking for updates

### Your contribution

I have already changed the masking algorithm for local use, so I will try to make a PR for that today or this later week. I am not sure tho about the speed of that so it will need to be checked.
1st problem would only be fixed if we allow it to work with lists of different Qs (question starting templates) e.g. [<|im_start|>tool", "<|im_start|>user]
2nd problem would only be fixed if we change the masking algorithm (e.g. mask all and unmask between A&Q, instead of mask everything between Q&A)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support masking of different and repeating roles DataCollatorForCompletionOnlyLM #3223

Feature request

Motivation

Your contribution

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Support masking of different and repeating roles DataCollatorForCompletionOnlyLM #3223

Description

Feature request

Motivation

Your contribution

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions