Should the prompt part (<|im_start|>user ....) be counted when the loss is calculated?
The prompt is the context, which does not change for a specific task during inference. Should the prompt pattern be learned during training?
Namely, should we make the following modification:
tgt = ids0 + tgt_audio + ids1
to
tgt = [IGNORE_TOKEN_ID] * len(ids0) + tgt_audio + [IGNORE_TOKEN_ID] * len(ids1)
in extractor_touch_asu?
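
A minimal sketch of the proposed change, to make the question concrete. It assumes IGNORE_TOKEN_ID is -100 (the default ignore_index of torch.nn.CrossEntropyLoss; the actual value in the repo may differ) and uses made-up token ids. build_labels and count_supervised are hypothetical helpers for illustration, not functions from extractor_touch_asu.

```python
# Assumption: -100 is the default ignore_index of torch.nn.CrossEntropyLoss,
# so label positions holding this value would not contribute to the loss.
IGNORE_TOKEN_ID = -100


def build_labels(ids0, tgt_audio, ids1, mask_prompt=True):
    """Build the label sequence for prompt prefix + audio target + prompt suffix.

    With mask_prompt=False this is the current behaviour (prompt tokens are
    supervised); with mask_prompt=True only the audio tokens are supervised.
    """
    if not mask_prompt:
        return ids0 + tgt_audio + ids1
    return [IGNORE_TOKEN_ID] * len(ids0) + tgt_audio + [IGNORE_TOKEN_ID] * len(ids1)


def count_supervised(labels):
    """Count the positions that would still contribute to the loss."""
    return sum(1 for token in labels if token != IGNORE_TOKEN_ID)


# Hypothetical token ids, for illustration only.
ids0 = [1, 2, 3]             # prompt tokens before the audio
tgt_audio = [10, 11, 12, 13]  # audio target tokens
ids1 = [4, 5]                # prompt tokens after the audio

unmasked = build_labels(ids0, tgt_audio, ids1, mask_prompt=False)
masked = build_labels(ids0, tgt_audio, ids1, mask_prompt=True)

print(unmasked)
print(masked)
print(count_supervised(unmasked), count_supervised(masked))
```

Both variants have the same length, so the labels stay aligned with the input; only the number of positions that feed the loss changes.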