Hello,
I am trying to use the Hugging Face model OpenGVLab/InternVideo2-Stage1-1B-224p-K400 with the transformers library for video feature extraction.
When I call:

```python
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("OpenGVLab/InternVideo2-Stage1-1B-224p-K400")
```

I get the error:

```
OSError: Can't load image processor for 'OpenGVLab/InternVideo2-Stage1-1B-224p-K400'.
... no preprocessor_config.json file
```
Looking at the repo, it only contains:

```
.gitattributes
1B_ft_k710_ft_k400_f16.pth
1B_ft_k710_ft_k400_f8.pth
README.md
```

There is no `config.json` or `preprocessor_config.json`.
This makes it incompatible with AutoImageProcessor / AutoVideoProcessor.
## Request
Could you either:
- add the appropriate processor/config files (e.g. `preprocessor_config.json`, `config.json`) so the model can be loaded via `transformers`, or
- provide guidance on the recommended way to preprocess inputs for this model when using the Hugging Face stack?
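In the meantime, this is the preprocessing I am assuming as a stopgap: 224×224 input (from "224p" in the repo name), 8 uniformly sampled frames (from the `_f8` checkpoint suffix), and standard ImageNet normalization. None of these values are confirmed by the model card, so corrections are welcome:

```python
import numpy as np

# Assumed constants: 224x224 input, ImageNet mean/std -- not confirmed
# by the model card, inferred from the repo/checkpoint names.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_frames(frames, num_frames=8):
    """frames: list of HxWx3 uint8 RGB arrays, already resized/cropped to 224x224.

    Returns a (1, 3, num_frames, 224, 224) float32 array, ready to be
    converted to a torch tensor and fed to the model.
    """
    # Uniformly sample num_frames frames across the clip.
    idx = np.linspace(0, len(frames) - 1, num_frames).round().astype(int)
    clip = []
    for i in idx:
        x = frames[i].astype(np.float32) / 255.0   # scale to [0, 1]
        x = (x - IMAGENET_MEAN) / IMAGENET_STD     # per-channel normalization
        clip.append(x.transpose(2, 0, 1))          # HWC -> CHW
    # Stack along the time axis -> (3, T, 224, 224), then add a batch dim.
    return np.stack(clip, axis=1)[None]
```

If the official pipeline uses a different frame count, crop strategy, or normalization, I would appreciate knowing the correct values.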
Thanks a lot for releasing this model!