Flux: Finetuning a Finetune with vs without the Original Text Encoders that were Trained? #1765
kmacmcfarlane started this conversation in General
Replies: 0 comments
I've been trying to finetune the new pixelwave-dev-3 checkpoint on my dataset, and I haven't been getting great results. I suspect it's because of the CLIP/T5 training that I imagine happened during the PixelWave training, though I could be off-base there.
On the Hugging Face repo, the T5 encoder is split into two parts (diffusers style?), and I wondered whether that is compatible with kohya training or whether it needs to be converted into a single file: https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03/tree/main/text_encoder_2
For reference, these are my training configs, in case there's some other obvious problem that could explain the poor results. For instance, faces come out as a blurry mass and the output is generally messy. The sample images are particularly crazy, and I thought maybe the `scale` parameter was set wrong (I've been dabbling with de-distilled models and might have changed it). I can't find examples of how it should be set for Flux in a TOML file, or any documented explanation of how the parameter behaves with Flux. Does it automatically represent Flux guidance on Flux models and CFG on non-Flux models?

- config.toml
- dataset.toml
- sample_promp.toml
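To make the `scale` question concrete, here is roughly the kind of entry I have in my sample prompt file. The key names are my guess at a TOML rendering of sd-scripts' text prompt syntax (`prompt --w 1024 --h 1024 --s 20 --l 3.5 --d 42`, where `--l` is the scale); whether `scale` here acts as Flux's distilled guidance value or as CFG is exactly what I'm unsure about.

```toml
# Assumed sample-prompt entry; key names are my guess, not verified docs.
[[prompt]]
text = "portrait photo of a woman, sharp focus"
width = 1024
height = 1024
sample_steps = 20
scale = 3.5   # Flux guidance? CFG? unclear to me for flux models
seed = 42
```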