Gemma 3 and Wan2.1 #357
DCVirtualCosmos started this conversation in Ideas
Replies: 0 comments
With the release of these powerful models, one can start dreaming of a tool that captions videos for training Wan LoRAs, and Gemma 3 seems like the perfect fit. It's pretty smart, follows instructions quite well, already has uncensored versions on Hugging Face, and can analyze a sequence of images well enough to describe what is happening in a short video.
So, it would be nice if TagGUI could:
I will try to do this myself when I have time, but perhaps you are faster!
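To make the "sequence of images" idea a bit more concrete, here is a minimal sketch of how frames might be picked from a short clip before handing them to a vision model. The function name `sample_frame_indices` and the frame counts are hypothetical, chosen only for illustration; nothing here is TagGUI's or Gemma 3's actual API, and the decoding and captioning steps are left out.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick up to num_samples evenly spaced frame indices from a clip.

    Hypothetical helper: it only chooses which frames to look at.
    Decoding the video and sending the frames to a model are out of scope.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # The clip is short enough to use every frame.
        return list(range(total_frames))
    # Split the clip into equal segments and take the middle of each one,
    # so the samples cover the whole clip instead of clustering at the start.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Example with made-up numbers: 8 frames out of an 81-frame clip.
    print(sample_frame_indices(81, 8))
```

The selected frames could then be passed as a batch of images together with a captioning prompt. How many frames to use would depend on the model's context budget, so treat the number as a tunable setting rather than a recommendation.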