Hi team,
I’m a new intern working on the VL project in usloth. I have read through the docs here:
Video Data Format - Intern VL
but I couldn’t find instructions on how to fine-tune a model on usloth.
Could you please clarify:
- Is there any documentation or a detailed guide on how to prepare training data for the model?
- Are there any workflows or sample scripts available to train or fine-tune the Intern VL model?
- Are there any specific requirements for data format, preprocessing, or environment setup?
Thank you!