Chroma fix t5 #2203
base: sd3
Conversation
Co-authored-by: aider (deepseek/deepseek-chat) <[email protected]>
```python
if self.is_schnell:
    t5xxl_max_token_length = 256
else:
    t5xxl_max_token_length = 512
```
Not sure if this is OK with Chroma, or if it should keep 256?
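For illustration only, the branch above can be sketched with a hypothetical Chroma case made explicit. The function name is invented, and treating Chroma like dev (512 tokens) is exactly the open question in this review, not settled code:

```python
# Sketch: pick the T5-XXL max token length per model variant.
# Whether Chroma should get 512 (like dev) or 256 is the open
# question in this review thread, not merged behavior.
def t5xxl_max_token_length(is_schnell: bool, is_chroma: bool = False) -> int:
    if is_schnell:
        return 256  # schnell uses the shorter context
    # dev -- and, tentatively, Chroma -- use the longer 512-token context
    return 512
```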
It looks like you're getting an error because you're trying to train CLIP-L, which doesn't exist for Chroma. Could you try training only DiT with `--network_train_unet_only`? Training T5 is not recommended, but you should be able to train DiT and T5 with the appropriate options.
I think it's not training the text encoder. New version of args:

```shell
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True accelerate launch \
  --mixed_precision bf16 \
  --num_cpu_threads_per_process 1 \
  /INTEL_SSD/github/fluxgym/sd-scripts/flux_train_network.py \
  --model_type chroma \
  --pretrained_model_name_or_path "models/unet/chroma_v10HD.safetensors" \
  --t5xxl "models/clip/t5xxl_fp16.safetensors" \
  --ae "models/vae/ae.sft" \
  --apply_t5_attn_mask \
  --cache_latents_to_disk \
  --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk \
  --dataset_config "outputs/neo_dataset.toml" \
  --discrete_flow_shift 3.1582 \
  --fp8_base \
  --gradient_accumulation_steps 16 \
  --gradient_checkpointing \
  --guidance_scale 0.0 \
  --highvram --mem_eff_attn \
  --t5xxl_max_token_length 512 \
  --learning_rate 7e-6 \
  --lr_scheduler cosine_with_restarts --lr_scheduler_min_lr_ratio 1e-8 --lr_warmup_steps 5 --lr_scheduler_num_cycles 50 \
  --loss_type l2 \
  --max_data_loader_n_workers 2 \
  --max_train_epochs 50 \
  --mixed_precision bf16 \
  --model_prediction_type raw \
  --network_alpha 32 \
  --network_dim 64 \
  --network_module networks.lora_flux \
  --optimizer_type adamw8bit \
  --output_dir "outputs/neo_chroma_test3" \
  --output_name chroma-lr7e-6 \
  --sample_every_n_steps 15 \
  --sample_every_n_epochs 1 \
  --save_every_n_epochs 1 \
  --sample_prompts "sample_prompts_chroma.txt" \
  --save_every_n_steps 15 \
  --save_last_n_steps 45 \
  --save_state \
  --save_last_n_steps_state 45 \
  --save_state_on_train_end \
  --network_train_unet_only \
  --save_model_as safetensors \
  --save_precision bf16 \
  --xformers --persistent_data_loader_workers \
  --seed 42 \
  --network_args "ggpo_sigma=0.03" "ggpo_beta=0.01" "split_qkv=True" "target_module=models.layers" \
  --timestep_sampling sigmoid \
  --sample_at_first
```

`--sample_at_first` added.

Don't know if I am the only one having problems with the original sd3 branch on Chroma HD. Here is my
I tested the latest command you shared and I get an error. It seems that GGPO and split_qkv cannot be used at the same time, so try removing one of them.
I have had a successful run using the command provided with this branch and didn't get any error. Sorry, I can't help much; I have barely above basic coding skills and used help from LLMs to fix it on my end. Chroma training could be the next SDXL unless Qwen beats it to it, so more people will try fine-tuning it by then. So you can put this on hold until more feedback is provided; I don't want to make you waste time.
Hi, I was getting an error when trying to train Chroma.

I tried the sd3 branch and the fix-chroma-training-withtout-te-cache branch; both failed. So I asked DeepSeek for help and got to this point in my investigation:
Looking at the code, the issue is in the process_batch method in train_network.py at line 430. The problem is that Chroma doesn't use CLIP-L, but the code assumes both text encoders are present.
Here's the specific fix needed:
In train_network.py around line 430, modify the code to handle Chroma models differently:
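The concrete code block from that suggestion was not captured above. A minimal sketch of the idea (guard every CLIP-L access so it is simply skipped when that encoder is absent) might look like the following; all names here are hypothetical illustrations, not sd-scripts' actual API:

```python
# Hypothetical sketch of the process_batch fix: only call CLIP-L when
# the model actually has one. Chroma ships with T5-XXL only, so its
# clip_l_encode would be None.
from typing import Callable, Optional

def encode_prompts(
    prompt: str,
    t5_encode: Callable[[str], list],
    clip_l_encode: Optional[Callable[[str], list]] = None,
) -> dict:
    outputs = {"t5_out": t5_encode(prompt)}
    if clip_l_encode is not None:  # None for Chroma: skip CLIP-L entirely
        outputs["clip_l_out"] = clip_l_encode(prompt)
    return outputs
```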
However, this is a deeper architectural issue. The real problem is that the dataset preparation for Chroma needs to be handled differently. Looking at the code structure:

1. In flux_train_network.py, the FluxNetworkTrainer class needs to override the tokenization strategy for Chroma.
2. The tokenization strategy should skip CLIP-L processing for Chroma models.
3. The dataset should provide T5 tokens instead of CLIP tokens for Chroma.
Better solution: Modify the get_tokenize_strategy method in FluxNetworkTrainer to handle Chroma differently:
You would also need to modify the FluxTokenizeStrategy to support use_clip_l=False for Chroma models, and ensure the dataset provides the correct tokenization outputs.
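As a sketch of that use_clip_l=False idea (the class name and the stand-in tokenizers are illustrative, not FluxTokenizeStrategy's real interface):

```python
# Illustrative tokenize strategy with a CLIP-L switch: Chroma would
# construct it with use_clip_l=False so only T5 tokens are produced.
class DualEncoderTokenizeStrategy:
    def __init__(self, use_clip_l: bool = True, t5_max_length: int = 512):
        self.use_clip_l = use_clip_l
        self.t5_max_length = t5_max_length

    def tokenize(self, prompt: str) -> dict:
        # Stand-in tokenizer: split on whitespace and truncate.
        words = prompt.split()
        tokens = {"t5": words[: self.t5_max_length]}
        if self.use_clip_l:
            tokens["clip_l"] = words[:77]  # CLIP's usual 77-token window
        return tokens
```

Chroma would then use `DualEncoderTokenizeStrategy(use_clip_l=False)`, and the cached dataset outputs would contain only T5 tokens.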
This is a fundamental architectural issue where Chroma support wasn't fully implemented in the training pipeline.
So, with its help, I went on to fix all the issues.
I just started a training run. It seems to work!
Also, the major difference from Flux Dev is that with Chroma I can use batch size 4 instead of 2 on 2x3090. VRAM is at 23.787 GB.
All images were cropped/scaled to 1024. Not sure if it makes a difference.
EDIT: Tried with bucketing and larger images: it works! BUT old .npz caches must be deleted if they are from Flux.
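A small helper for clearing those stale caches, as a sketch (it assumes the caches sit under the dataset directory as `.npz` files; the path handling is illustrative):

```python
# Delete leftover .npz latent/text-encoder caches from a previous Flux
# run so a Chroma run re-caches everything from scratch.
from pathlib import Path

def clear_npz_caches(dataset_dir: str) -> int:
    """Remove all .npz files under dataset_dir; return how many were deleted."""
    removed = 0
    for npz in Path(dataset_dir).rglob("*.npz"):
        npz.unlink()
        removed += 1
    return removed
```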
Not sure this code is up to standard, or whether it has created other issues.
For now it trains; are quality issues possible due to the code changes?
Posting here to help others.