[Q] InternVideo 2.5, TPO #61

@ilileun

Hello, thank you for the excellent research and for sharing the code.

I understand that TPO (Task Preference Optimization) has been applied to InternVideo 2.5, and as mentioned in the related paper, it includes three task-specific heads: region, temporal, and mask.

I have two questions regarding this:

  1. Are these three heads already implemented and integrated into the current InternVideo 2.5 codebase?
  2. The paper describes a detailed multi-stage training process, but the repository currently provides only inference scripts. Will the training scripts for these heads be released in the future? Alternatively, is there any guidance or reference available for performing supervised fine-tuning (SFT) with these task heads?

Any support or clarification would be greatly appreciated. Thank you again for your valuable contribution!
