When using SDXL, an error occurs if a prompt contains too many `!` characters. The minimal code that reproduces the problem is below.
```python
from diffusers import StableDiffusionXLPipeline
import torch
from compel import Compel, ReturnedEmbeddingsType

prompt = "!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!"

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-3.1",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")

compel = Compel(
    tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
    text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
    truncate_long_prompts=False,
)

compel.build_conditioning_tensor(prompt)
```
Running this produces the following error:
```
Traceback (most recent call last):
  File "/home/sk-uma/create_dataset/test_compel.py", line 26, in <module>
    compel.build_conditioning_tensor(prompt)
  File "/opt/conda/lib/python3.10/site-packages/compel/compel.py", line 112, in build_conditioning_tensor
    conditioning, _ = self.build_conditioning_tensor_for_conjunction(conjunction)
  File "/opt/conda/lib/python3.10/site-packages/compel/compel.py", line 186, in build_conditioning_tensor_for_conjunction
    this_conditioning, this_options = self.build_conditioning_tensor_for_prompt_object(p)
  File "/opt/conda/lib/python3.10/site-packages/compel/compel.py", line 218, in build_conditioning_tensor_for_prompt_object
    return self._get_conditioning_for_flattened_prompt(prompt), {}
  File "/opt/conda/lib/python3.10/site-packages/compel/compel.py", line 282, in _get_conditioning_for_flattened_prompt
    return self.conditioning_provider.get_embeddings_for_weighted_prompt_fragments(
  File "/opt/conda/lib/python3.10/site-packages/compel/embeddings_provider.py", line 535, in get_embeddings_for_weighted_prompt_fragments
    text_embeddings = torch.cat(text_embeddings_list, dim=-1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 77 but got size 154 for tensor number 1 in the list.
```
The problem comes from a difference between SDXL's `tokenizer` and `tokenizer_2`: the `pad_token` of `tokenizer` is `<|endoftext|>`, while the `pad_token` of `tokenizer_2` is `!`. In addition, both tokenizers merge consecutive `!` characters into a single token. As a result, the two tokenizers produce different numbers of tokens for the same prompt once padding is applied, and the concatenation in `get_embeddings_for_weighted_prompt_fragments` fails.
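You can see the mismatch at the tokenizer level without running the full pipeline. A minimal sketch (it assumes the `transformers` `CLIPTokenizer` and the same model repo as above):

```python
from transformers import CLIPTokenizer

repo = "cagliostrolab/animagine-xl-3.1"
tok1 = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
tok2 = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer_2")

# The two tokenizers disagree on the padding token:
print(repr(tok1.pad_token))  # '<|endoftext|>'
print(repr(tok2.pad_token))  # '!'

# The BPE merges runs of "!" into single tokens, so appending "!" pad
# tokens to a prompt that already ends in "!" does not grow the token
# count one-for-one:
for n in (1, 2, 4, 8):
    ids = tok2("!" * n, add_special_tokens=False).input_ids
    print(n, len(ids))
```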
The simplest workaround is to pass the same tokenizer for both encoders:
```python
compel = Compel(
    tokenizer=[pipeline.tokenizer, pipeline.tokenizer],
    text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
    truncate_long_prompts=False,
)
```
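With identical tokenizers, both embedding providers split the prompt into the same number of 77-token chunks, so the `torch.cat` succeeds. As far as I can tell, the two SDXL tokenizers share the same BPE vocabulary and differ mainly in their configured `pad_token`, so `text_encoder_2` still receives valid token ids.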