Skip to content

anwer_clip_ollama: new operator to do vqa on clips #23

@geoffroy-noel-ddh

Description

@geoffroy-noel-ddh

This is similar to BVQA and the answer_transcription_ollama.

This assumes that ollama support video input and has compatible VLMs. E.g. qwen3-vl

One hurdle is finding a method to efficiently pass a video to ollama using FrameSense container architecture.

Metadata

Metadata

Labels

No labels
No labels

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions