Description
I have been working with the Infinity model and encountered some issues during fine-tuning. I would greatly appreciate your insights on the following:
- When performing full fine-tuning of the Infinity model, training is significantly slower than with comparable diffusion models. Could you advise which parts of the model might be causing this slowdown? (A profiling sketch follows this list.)
- Interestingly, when I fine-tune only a subset of the parameters (as opposed to the full model), training speed remains roughly the same as with full fine-tuning. I would typically expect faster training when fewer parameters are updated. Could you explain why this happens? (A freezing sketch follows this list.)
- Lastly, when I fine-tune only a subset of the parameters, I would expect to be able to use a larger batch size than with full fine-tuning. However, increasing the batch size still results in an out-of-memory (OOM) error. Why might this occur even though fewer parameters are being trained? (A memory sketch follows this list.)
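For the first question, this is roughly how I have been trying to locate the bottleneck; a minimal sketch assuming a standard PyTorch training step (the `model(**batch).loss` interface and the step structure are my own assumptions, not the repo's actual trainer):

```python
import torch
from torch.profiler import profile, record_function, ProfilerActivity

def profile_step(model, batch, optimizer):
    # Profile one optimization step to see which ops dominate wall time.
    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
                 record_shapes=True) as prof:
        with record_function("train_step"):
            loss = model(**batch).loss  # assumed interface; adapt to the real trainer
            loss.backward()
            optimizer.step()
            optimizer.zero_grad(set_to_none=True)
    # Sort by CUDA time to surface the slowest kernels/modules first.
    print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```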
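For the second question, this is what I mean by fine-tuning a subset; a minimal sketch assuming a plain PyTorch module (the prefix names are hypothetical). My understanding is that frozen layers still run the full forward pass, and the backward pass still flows through them whenever trainable parameters sit earlier in the network, which may be why I see no speedup:

```python
import torch

def freeze_all_but(model: torch.nn.Module,
                   trainable_prefixes: tuple) -> torch.optim.Optimizer:
    # Freeze everything, then re-enable gradients for the named subset.
    for name, param in model.named_parameters():
        param.requires_grad_(name.startswith(trainable_prefixes))
    # Only trainable parameters receive gradients and AdamW state (m and v).
    return torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-5
    )

# Hypothetical usage: train only the last blocks and the output head.
# optimizer = freeze_all_but(model, ("blocks.30", "blocks.31", "head"))
```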
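For the third question, here is my rough mental model of where the memory goes; a back-of-envelope sketch (assuming bf16 weights with fp32 gradients and AdamW state, all numbers illustrative). If it is correct, activations rather than parameters are what scale with batch size, which would explain the OOM:

```python
def rough_memory_gib(total_params: int,
                     trainable_params: int,
                     activation_gib_per_sample: float,
                     batch_size: int) -> float:
    # Weights are held for every parameter regardless of freezing (bf16: 2 B each).
    weight_bytes = 2 * total_params
    # Gradients (fp32) and AdamW moments m/v (fp32 each) exist only for
    # trainable parameters, so freezing shrinks this term...
    grad_and_optim_bytes = (4 + 4 + 4) * trainable_params
    # ...but activations are stored for the forward graph and grow linearly
    # with batch size, which is presumably where the OOM comes from.
    activation_bytes = activation_gib_per_sample * batch_size * 2**30
    return (weight_bytes + grad_and_optim_bytes + activation_bytes) / 2**30

# Hypothetical numbers for illustration only:
# print(rough_memory_gib(2_000_000_000, 100_000_000, 1.5, 8))
```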
Thank you in advance for your support.