Open
Description
Hello, thank you for the excellent research and for sharing the code.
I understand that TPO (Task Preference Optimization) has been applied to InternVideo 2.5, and as mentioned in the related paper, it includes three task-specific heads: region, temporal, and mask.
I have two questions regarding this:
- Are these three heads already implemented and integrated into the current InternVideo 2.5 codebase?
- The paper describes a detailed multi-stage training process, but the repository currently provides only inference scripts. Will the training scripts for these heads be released in the future? Alternatively, is there any guidance or reference available for performing supervised fine-tuning (SFT) with these task heads?
Any support or clarification would be greatly appreciated. Thank you again for your valuable contribution!