Hi authors,
I appreciate your amazing work.
I am using InternVideo2-1B as the backbone for my work and want to train a CLIP model on our custom dataset.
Could you provide scripts for fine-tuning a CLIP model from the InternVideo2-1B checkpoint,
or any instructions on how to do that?
Thank you so much.