what happens with overly-long captions? #946
-
I think it will work fine as long as you've set the max token length to 150 instead of 75. You can do that in the Advanced tab of the kohya web GUI, or pass `--max_token_length=150` if you're running sd-scripts directly from the command line. That option doesn't seem to apply to the sample images generated during training, but that won't affect the training itself. There's someone asking for it to work for the samples here: …
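To picture what a longer token limit could mean in practice, here is a minimal sketch of splitting a caption into 75-token chunks, each wrapped in start and end tokens. This is a simplified model and not the sd-scripts implementation: the function name `split_into_chunks`, the padding scheme, and the use of plain integer lists are all assumptions made for illustration. The token IDs are the usual CLIP start/end IDs.

```python
# Simplified model of chunked captions. NOT the actual sd-scripts code;
# names and padding behaviour here are assumptions for illustration.
BOS = 49406  # CLIP start-of-text token ID
EOS = 49407  # CLIP end-of-text token ID (reused as padding in this sketch)


def split_into_chunks(token_ids, max_token_length=150, chunk_size=75):
    """Split caption tokens into fixed-size chunks of chunk_size + 2 tokens.

    Tokens beyond max_token_length are dropped. Each chunk is padded to
    chunk_size and wrapped in BOS/EOS so it has the 77-token shape the
    text encoder expects.
    """
    kept = list(token_ids[:max_token_length])
    chunks = []
    for start in range(0, max_token_length, chunk_size):
        body = kept[start:start + chunk_size]
        body = body + [EOS] * (chunk_size - len(body))  # pad short chunks
        chunks.append([BOS] + body + [EOS])
    return chunks


# A hypothetical 100-token caption (stand-in IDs 0..99).
caption = list(range(100))
chunks = split_into_chunks(caption)
print(len(chunks), [len(c) for c in chunks])
```

With a limit of 150, the 100-token caption in this sketch fits into two chunks, so nothing is lost; with the default of 75 only the first chunk would exist.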
-
A bit of a late reply, but I've been wondering the same. Take all of this with a truckload of salt, because I'm not a coder, just a hobbyist, and I could be misreading things. A look at the code shows that the caption is truncated:
Testing this and the associated functions shows that tokens over the limit are discarded. I pulled that one from …
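As a rough illustration of the truncation described here, the sketch below keeps the first 75 caption tokens and discards the rest before adding the start and end tokens. It is a toy model under stated assumptions, not the code from the repository: `truncate_caption` is a hypothetical helper, and the caption is represented as a plain list of integer IDs.

```python
# Toy model of caption truncation. NOT the repository's code;
# truncate_caption is a hypothetical helper for illustration only.
BOS = 49406  # CLIP start-of-text token ID
EOS = 49407  # CLIP end-of-text token ID


def truncate_caption(token_ids, max_token_length=75):
    """Return (encoder_input, dropped_tokens).

    Only the first max_token_length tokens survive; anything after that
    never reaches the text encoder.
    """
    kept = list(token_ids[:max_token_length])
    dropped = list(token_ids[max_token_length:])
    return [BOS] + kept + [EOS], dropped


# A hypothetical 100-token caption (stand-in IDs 0..99).
encoder_input, dropped = truncate_caption(list(range(100)))
print(len(encoder_input), len(dropped))
```

In this model, the last 25 tokens of a 100-token caption are simply lost, which matches the observation that tokens over the limit are discarded rather than moved into a second chunk.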
-
During training, what happens with overly long captions? If they tokenize to more than 77 tokens, what happens to the remainder? Is it truncated, or is the caption treated as 2 separate chunks?